Separate PK in Jxn Tbl?

**salad**

Sylvain Lafontaine wrote:
I concede the point that for the two keys of the junction table, using an
autonumber primary key is overkill except for special situations.

Shouldn't a database be designed right from the beginning?

I prefer KISS.

Thus I prefer an autonumber.

Then again, junction tables are rarely needed.

**JOG**

On Jan 29, 8:37 pm, Marshall wrote:
On Jan 29, 7:56 am, "Neil" wrote:

"Jamie Collins" wrote in message

Here in the UK I avoid using the word 'moot' when trying to write
'plain English' simply because the US usage has obscured the UK usage
i.e. it can cause confusion.

You could use the alternate "moo" point, meaning that the point is full of
methane gas. At least it would be clear what you mean.... ;-)

Amusingly, a lot of people misperceive the word as "mute."

This probably comes from you fellas over the pond not pronouncing
words (*cough*) correctly. Mute should of course be pronounced at
though there was a y between the m and the u!

In fact, I challenge you to a dool next Toosday morning to settle the
matter, right after I've eaten my moosli ;P

It's a mute point, meaning it can't say anything any longer.
It sorta vaguely works in a metaphoric way.

Of course, on the internet, you can find many examples of
fractured usage. Ultimately it's just a waist of time.snicker

Marshall

**Brian Selzer**

"Salad" wrote in message
...
Sylvain Lafontaine wrote:
I concede the point that for the two keys of the junction table, using an
autonumber primary key is overkill except for special situations.

Shouldn't a database be designed right from the beginning?

I prefer KISS.

Thus I prefer an autonumber.

Then again, junction tables are rarely needed.

Simple is as Simple does.

**Tony Toews [MVP]**

-CELKO- wrote:

Let's take an example that is a very strong natural key -- (longitude,
latitude). Established for centuries. Well-defined operations, etc.
Validation can be done by GPS or a few million maps. Can you explain
how this immutable key gets changed more often that some "synthetic
key" for locations?

Datums - NAD27, NAD83, and WGS84.all have slightly different meanings for the same
latitude and longitude for North America.
http://en.wikipedia.org/wiki/Geodetic_datum

And there are different latitudes http://en.wikipedia.org/wiki/Latitude

Tony
--
Tony Toews, Microsoft Access MVP
Please respond only in the newsgroups so that others can
read the entire thread of messages.
Microsoft Access Links, Hints, Tips & Accounting Systems at
http://www.granite.ab.ca/accsmstr.htm
Tony's Microsoft Access Blog - http://msmvps.com/blogs/access/

**Tony Toews [MVP]**

Frank Hamersley wrote:

Therein lies its criminality g - it screams encouragement for dabblers
and barely offers anything for artisans except stupendous numbers of
mouse clicks!

OTOH do you want them using Excel for data management? Single user?

A friend was telling me of a huge Excel file with many workbooks where data
normalizing meant going through the cells looking for occurances of similar names and
ensuring they all had the same name.

And, of course, Excel is single user as far as I know.

Take for instance the number of versions it took before separating the
data from the "code" was a core feature by way of the provision of a
menu option to reattach a data .mdb!

A97 has the wizard. I don't recall about previous versions.

Tony
--
Tony Toews, Microsoft Access MVP
Please respond only in the newsgroups so that others can
read the entire thread of messages.
Microsoft Access Links, Hints, Tips & Accounting Systems at
http://www.granite.ab.ca/accsmstr.htm
Tony's Microsoft Access Blog - http://msmvps.com/blogs/access/

**Tony Toews [MVP]**

"Keith Wilby" wrote:

But then again the US don't have colour television programmes either do
they? Or aluminium ;-)

We in Canada do have colour TVs, although I don't actually own one. But we don't
have aluminium. So where does that put us?

Tony
--
Tony Toews, Microsoft Access MVP
Please respond only in the newsgroups so that others can
read the entire thread of messages.
Microsoft Access Links, Hints, Tips & Accounting Systems at
http://www.granite.ab.ca/accsmstr.htm
Tony's Microsoft Access Blog - http://msmvps.com/blogs/access/

**Jamie Collins**

On Jan 29, 8:37 pm, "Sylvain Lafontaine" sylvain aei ca (fill the
blanks, no spam please) wrote:
I was making a direct reference to the following quote:

In a recent thread on this subject, Tony Toews Access MVP qualified
that he liked using incremental autonumbers (rather than random)
because they where easier to type (WHERE ID = -2001736589 may
encourage typos) and easier to drop into conversation ("Hello Tony?
I'm seeing a problem with the record where the ID is -2001736589...").

Of course, theoritically and in a world with unlimited budget, you're right
in the sense that a surrogate key should never cross the boundaries of a
database (the interface beeing located inside in these boundaries) but my
clients don't have infinite budget and my brain isn't infinite either.

OK since you admit you raised the 'budget' issue, what do *you* think
the impact on 'budget' is when choosing random autonumber over
incremental autonumber?

My understanding is that Tony Toews Access MVP chooses incremental
autonumber because he likes to refer to a row (entity) using an
address (or possibly by position) i.e. he doesn't choose random
autonumber because he doesn't like the values it would generate (too
many digits, negative values, etc). I offer this to discredit your
assertion, "for those who are using surrogate keys, the exact value of
an address inside the database has zero importance".

Again, I ask you to consider the posts we see in the Access groups
asking to reseed an incremental autonumber back to one or because they
are perplexed/outraged that gaps have appeared in their autonumber
sequences. If the exact values of these so-called surrogates have
"zero importance" to these people then why are they making these
requests?

PS what about the other comments you seemingly aimed at me: codifying,
reengineering, etc? Why no mention of them in your reply?

Jamie.

--

**Sylvain Lafontaine**

Sorry if I didn't responded before but I wasn't sure to understand what you
have wrote. I partage your opinion that a composite key can not only
unequivocally identifying any item in a particular database but also remain
constant (ie. never change from database state to database state) in many
systems.

But as I said, this is not true for all systems and there are occasions
where the value will change from state to state. For instance, is instead
of an inventory system you have a sport ligue system, with a table making a
jonction between a list of players and a list of team; it's easy to see that
in such a table, the composite key may change its value from state to state;
for example when a player is exchanged between two teams.

I don't want to enter into a discussion of the full range of possibilities
(for example, do you want the database to remember the previous
relationship, what about the player number (if a player change his/her
number, do you want to keep the older information?), the individual
statistics, etc., etc.) but it doesn't take too long to see that in such a
situation, the use of a composite key to express the relationships between
tables will rapidly become like hell. Like someone else has said: « been
there, done that » and personally, it's not my intention to go back there.
Everyone know that when it's time to make a decision, one personal
experience has more weight than a thousand opinions so for me, my first
reaction about using a composite primay key will be a no go.

But there are other occasions, like your inventory system, where the
possibility that a composite primary key can change its value don't exist.
In these occasions, would it be overkill to use a separate primary key?
Personally, I don't mind using a separate primary key even on these
occasions but I fully understand that other people might feel unconfortable
to do the same. Everyone know that the devil like to hide in the details;
probably that besides the details, the composite keys are also one of his
favorite places to hide.

--
Sylvain Lafontaine, ing.
MVP - Technologies Virtual-PC
E-mail: sylvain aei ca (fill the blanks, no spam please)

"Brian Selzer" wrote in message
...

"Sylvain Lafontaine" sylvain aei ca (fill the blanks, no spam please)
wrote in message ...
To that, I would add that the increased simplicity of using a surrogate
(or artificial or autonumber) key as the primary key in place of a
composite key is only half their advantage.

The biggest problem that I have with composite keys is that they share
the same fundamental problem as natural keys: using them as the primary
key is allowing the fact that a primary key can change its value over
time. IMHO, a primary key should never be allowed to change its value
once it has been created; a assumption which will forbid the use of a
composite key in many cases. (Of course, if you don't mind to see a
primary key changing its value after its creation then you are not
concerned by this argument.).

This argument has an inherent fallacy in it. Just because a key is
composed from multiple columns doesn't necessarily mean that its values
can be different in different database states. For example, in an
Inventory table that has the key, {ItemKey, WarehouseKey}, with references
to an Item table and a Warehouse table respectively, the combination
values that comprise each key value can never change from database state
to database state. A particular combination of values identifies a
particular individual in the Universe of Discourse in /every/ database
state in which it appears. It can /never/ identify any other individual.
Therefore, it should be obvious that adding an additional autonumber
primary key in this instance would be superfluous, since each {ItemKey,
WarehouseKey} combination already rigidly designates a distinct individual
in the Universe of Discourse.

The same can be said for many natural keys. For example, suppose you have
a table, Queue, that has an integer key, {Position}. Each value for
Position rigidly designates a distinct individual in the Universe of
Discourse (3 always means "third in line" in any database state in which
there are 3 or more elements), so therefore there is no need for an
additional autonumber primary key.

This is not only a theoritical argument as many interfaces - like
Access - won't like to see a primary key that could change it value. But
even if you take out such interfaces out of the equation, the use of a
surrogate key for all tables reveals itself to be advantageous in many
database problems. For example, if you want to add a log of all changes
to a table, it's much more easier to design it if the table use a
surrogate key for its primary key than a natural key or a composite key.

Personally, I stopped using natural keys and composite keys many years
ago and probably that something like half of my problems with the design
of databases have vanished with them. On these occasions when I was
called to work on a problematic database, chances was much higher to see
that the problems were associated with the use of natural keys and/or
composite keys than with the use of a surrogate keys and the solutions
were usually much more complicated to solve in the first case than in the
second case.

Also, I've remember some peoples who have done like me and have stopped
using natural and composite keys in favor of the exclusive use of
surrogate keys but I don't remember anyone doing the opposite; ie. going
from the use of surrogate keys to the use of natural and composite keys.

--
Sylvain Lafontaine, ing.
MVP - Technologies Virtual-PC
E-mail: sylvain aei ca (fill the blanks, no spam please)

wrote in message
...
On Jan 25, 9:12 am, Jamie Collins wrote:
On Jan 24, 11:00 pm, "James A. Fortune"
wrote:

Whenever I have multiple key fields, natural or not, I create an
AutoNumber PK for pragmatic reasons. The main reason is that it makes
it easier to create the joins. The theorists are champions at joining
tables and don't have to be concerned with the complexity of the SQL
they write.

Word to the wise: 'theorists' hate SQL.

Queries involving many to many relationships often add
additional tables later and highlight the need to keep joins as simple
as possible.

I think I'm with Bob Badour (gulp!) on the issue of complexity,
though: if you think more columns in the ON clause makes a SQL join
more 'complex' then I think you could be looking at things wrong.
Having more characters to type increases the risk of typos? More
columns mean you may omit one in error? The SQL engine may be twice as
slow in handling two columns rather than one? Is it more 'complex' to
split a post address into 'subatomic' columns (address lines from
postal code/zip etc)?

Surely the issue you allude to (I think) is the one that Access
Relationships (as distinct from Jet foreign keys) were invented to
solve? i.e. you pre-define the join columns and 'join type' (inner
join, left outer join or right outer join) and the join clause gets
written as SQL for you when you drop the tables into the Query Builder
thing. I would have thought the 'theorists' would love the fact that
you also create foreign keys in the same Relationships dialog i.e. you
end up with a natural join (not having to explicitly specify the
columns yourself) because one table references the other.

[I tend to be dismissive of tools that write SQL code for me but I
think I should perhaps review my stance e.g. I still write all my SQL
Server procs by hand whereas I have tasked myself to investigate CRUD
generators. But, for the time being, ...] As a SQL coder myself, I
find it more annoying that I have to create multiple joins to get the
'natural key' values, having to discover what the 'artificial key'
columns are in the first place.

Lately, I've increased the amount of normalization in one
of my databases and the joins got even more complicated, adding about
a
line or so in the SQL view in Access for every new query using those
tables.

Bad luck: I think you might have got way with "reduced the amount of
denormalization" ;-) In this thread I've already broken my personal
rule (!!) about not mentioning normalization [formulated because the
average 'replier' around here thinks "fully normalized" is BCNF, which
they think is 3NF anyhow, and doesn't pay much attention to anomalies
that normalization doesn't address, unless the 'asker' mentions
storing calculations...]

I keep Jamie's advice in the back of my mind, about how enforcing
constraints at the table level is better than enforcing them through
code

..and best to do it in both places! Bear in mind that it's a rule of
thumb i.e. "strict rules modified in practise." Checking something in
the in front end allows you to give timely user feedback and could
save them some keying, not to mention a database roundtrip. Checking
in the database catches anything neglected in the front end by
omission of validation or introduction of bugs. In practice, some
things are better done in one place but not the other: contrast the
validation of the basic pattern of an email address with the
verification that an addressable entity can be contacted at that email
address; I don't think it would be sensible to put the latter test
into a table constraint, even if it were possible.

Jamie.

--

What part of simpler don't you understand :-). Only one expression in
the ON is simpler. Needing less indexes is simpler. Not having to
look for your multi-key fields is easier, although your point that
Relationships can handle that is valid. If the AutoNumber key has a
one-to-one relationship with the multi-key fields then it's fine to
use it. There's no down side that I can see. I also like to rely on
coding to detect inconsistent data rather than on error trapping, so I
have to check the multi-key values anyway before adding a new record.
I think that your idea about enforcing constraints at both the table
level and in code is an excellent idea. The OP wanted to know what
people did and why. I still don't see any reason put forward for me
to change to a multi-field key. Are totals queries easier when multi-
field keys are used? BTW, "reduced the amount of denormalization"
works just as well. Real databases experience denormalizing
influences.

James A. Fortune

**Jamie Collins**

On Jan 30, 2:52*am, "Tony Toews [MVP]"
Datums - NAD27, NAD83, and WGS84.all have slightly different meanings for the same
latitude and longitude for North America

Do you mean a bit like time zones? Should I avoid using them and
instead roll my own 'local time'?

And there are different latitudes

Do you mean a bit like there are ISBN-10 and ISBN-13. Should I avoid
using them and roll my own 'BookID'?

Jamie.

--

**Sylvain Lafontaine**

OK since you admit you raised the 'budget' issue, what do *you* think
the impact on 'budget' is when choosing random autonumber over
incremental autonumber?

If you strictly take a look at performance considerations, random
autonumbers don't scale well with indexes. Most (but not all) systems will
probably scale better if you are using a monotonously increasing primary
key. A monotonously increasing sequence is also probably easier to read than
a set of random number when you are debugging a system. The fact that I'm
not interested in the value of any particular primary key doesn't mean that
I won't mind to see a collection of primary key values to have the property
to be a sequence of monotonously increasing (or decreasing, especially in
the case of some replication scenarios) numbers. Like I said in my previous
post, my brain is like the budget of my clients: it's not infinite either.

However, there are of course some scenarios where a random autonumber will
be used. This is often used in replicated systems; particularly when there
are online/offline clients.

Again, I ask you to consider the posts we see in the Access groups
asking to reseed an incremental autonumber back to one or because they
are perplexed/outraged that gaps have appeared in their autonumber
sequences. If the exact values of these so-called surrogates have
"zero importance" to these people then why are they making these
requests?

You don't see this only in Access groups. The same question is often asked
on other groups as well: searching Google Groups for IDENTITY_INSERT reveals
6520 hits:

http://groups.google.com/groups/sear...DENTITY_INSERT

PS what about the other comments you seemingly aimed at me: codifying,
reengineering, etc? Why no mention of them in your reply?

I seemingly aimed no comment at you. (And usually - but not always - at
anyone else.). This is a thread with a great number of messages posted by
multiple peoples and my comments are destined to anyone interested in
reading my posts. Maybe my style of writing is not enough impersonal but
practically all my posts should be read as being impersonal; excerpt for the
inclusion of the usual forms of politeness.

As for why I don't mention everything in my replies, I volontairily do so
because I try to reply only when I've something new to say. When I wrote
something, if somebody else is not able to understand it - without making
any assumption here on why he/her don't understand it - quite probably that
he/her won't be able to understand it again if I repeat myself. Usually, I
don't feel the need to try to have the last word. There are a lot of people
that will read these messages and I think that they are quite capable of
forging their own opinion even if I didn't repeated the same thing over and
over again or didn't posted last. Of course, my mind is not perfect; so I
do repeat myself or make a last post from time to time.

--
Sylvain Lafontaine, ing.
MVP - Technologies Virtual-PC
E-mail: sylvain aei ca (fill the blanks, no spam please)

"Jamie Collins" wrote in message
...
On Jan 29, 8:37 pm, "Sylvain Lafontaine" sylvain aei ca (fill the
blanks, no spam please) wrote:
I was making a direct reference to the following quote:

In a recent thread on this subject, Tony Toews Access MVP qualified
that he liked using incremental autonumbers (rather than random)
because they where easier to type (WHERE ID = -2001736589 may
encourage typos) and easier to drop into conversation ("Hello Tony?
I'm seeing a problem with the record where the ID is -2001736589...").

Of course, theoritically and in a world with unlimited budget, you're
right
in the sense that a surrogate key should never cross the boundaries of a
database (the interface beeing located inside in these boundaries) but my
clients don't have infinite budget and my brain isn't infinite either.

OK since you admit you raised the 'budget' issue, what do *you* think
the impact on 'budget' is when choosing random autonumber over
incremental autonumber?

My understanding is that Tony Toews Access MVP chooses incremental
autonumber because he likes to refer to a row (entity) using an
address (or possibly by position) i.e. he doesn't choose random
autonumber because he doesn't like the values it would generate (too
many digits, negative values, etc). I offer this to discredit your
assertion, "for those who are using surrogate keys, the exact value of
an address inside the database has zero importance".

Again, I ask you to consider the posts we see in the Access groups
asking to reseed an incremental autonumber back to one or because they
are perplexed/outraged that gaps have appeared in their autonumber
sequences. If the exact values of these so-called surrogates have
"zero importance" to these people then why are they making these
requests?

PS what about the other comments you seemingly aimed at me: codifying,
reengineering, etc? Why no mention of them in your reply?

Jamie.

--

Thread Tools
Show Printable Version Email this Page
Display Modes
Linear Mode Switch to Hybrid Mode Switch to Threaded Mode