Data Primary key vs. Artificial (Autonumber) primary key

**Jamie Collins** · December 7th, 2007, 08:37 AM posted to microsoft.public.access.tablesdbdesign

On Dec 6, 4:06 pm, Dennis wrote:
I don't know what you mean but memo fields can be indexed and can be
used in WHERE and HAVING clauses; you can cast them using CSTR() in
SQL code to prevent them from being truncated in certain
circumstances.

Then you're a better man that I, because I could never get them to work for
me in that regard. But that wasn't the first time I've said that, and it
CERTAINLY won't be the last - heh....

Try this:

CREATE TABLE Test1 (memo_col MEMO NOT NULL)
;
CREATE INDEX idx__test1__memo_col ON Test1 (memo_col)
;
CREATE TABLE Test2 (memo_col MEMO NOT NULL PRIMARY KEY)
;
CREATE TABLE Test3 (memo_col MEMO NOT NULL UNIQUE)
;

All the above work in Jet. The real issue is that Jet only tests the
first 255 characters of a MEMO column when testing a unique constraint
and in other contexts e.g. GROUP BY, UNION (but not UNION ALL), etc.
For example:

INSERT INTO Test2 (memo_col) VALUES
('123456789a123456789a123456789a123456789a12345678 9a123456789a123456789a123456789a123456789a12345678 9a123456789a123456789a123456789a123456789a12345678 9a123456789a123456789a123456789a123456789a12345678 9a123456789a123456789a123456789a123456789a12345678 9a12345x')
;
INSERT INTO Test2 (memo_col) VALUES
('123456789a123456789a123456789a123456789a12345678 9a123456789a123456789a123456789a123456789a12345678 9a123456789a123456789a123456789a123456789a12345678 9a123456789a123456789a123456789a123456789a12345678 9a123456789a123456789a123456789a123456789a12345678 9a12345z')
;

The above represent different values when considering all characters
but the engine considers them non-unique when checking the PK because
only the first 255 characters are used in the check. Off the top of my
head (so probably not the best), here's one way of achieving a unique
constraint with a MEMO column (ANSI-92 Query Mode syntax):

CREATE TABLE Test4 (
memo_col MEMO NOT NULL,
CONSTRAINT Test4__memo_col__unique
CHECK (1 = (
SELECT COUNT(*)
FROM Test4 AS T4
WHERE CSTR(Test4.memo_col) = CSTR(T4.memo_col))))
;
INSERT INTO Test4 (memo_col) VALUES
('123456789a123456789a123456789a123456789a12345678 9a123456789a123456789a123456789a123456789a12345678 9a123456789a123456789a123456789a123456789a12345678 9a123456789a123456789a123456789a123456789a12345678 9a123456789a123456789a123456789a123456789a12345678 9a12345x')
;
INSERT INTO Test4 (memo_col) VALUES
('123456789a123456789a123456789a123456789a12345678 9a123456789a123456789a123456789a123456789a12345678 9a123456789a123456789a123456789a123456789a12345678 9a123456789a123456789a123456789a123456789a12345678 9a123456789a123456789a123456789a123456789a12345678 9a12345z')
;

Jamie.

--

**Jamie Collins** · December 7th, 2007, 09:04 AM posted to microsoft.public.access.tablesdbdesign

On Dec 6, 6:34 pm, Dale Fye wrote:
I always love these discussions, because you have some purists, some
"religious fanatics", and some simpletons; call me a simpleton. The reason I
use "meaningless" primary keys is that it saves me work!!!

When I want to update a record, or select a record with an autonumber PK, I
type:

WHERE ID = 123

If I use a composite PK, I have to type:

WHERE [somefield] = somevalue
AND [someotherfield] = somevalue
AND [athirdfield] = somevalue
AND ....

This seems like a lot of extra work to me, so I avoid it where I can.

Adding an artificial key to each table introduces data elements not
present in the reality being modelled. This seems like a lot of extra
work to me, so I avoid it where I can. The obvious example is a single-
column lookup table.

I've just got back from a project where an entity's four-column
'natural' key was replaced with a single-column meaningless key. Note,
however, the existing model was flawed because the real world key
would require six columns (not that columns exist in the real world of
course g) and the required data is not currently being captured. So
I guess I'm not a purist because, although I am inclined towards a
natural key e.g. a single column lookup table) I can open my mind up
to an artificial key when appropriate.

If you imagine I spent my time on this project 'simplifying' SQL
queries by trimming down WHERE clauses and reducing the SELECT clause
by three columns then you'd be dead wrong. Those columns involved in
the 'natural' key are still required by the application in most
scenarios so a lot of the donkey work has been creating JOINs back to
the entity tables to find out the real key values, sometimes six
tables deep. I've yet to do the metrics but it seems to me that the
additional table joins have had a noticeable negative impact on
performance.

Another minor issue I have is that the tables are less readable. For
example, looking at the InvoicePayments table I now only see the
meaningless invoice_ID whereas I used to see customer_ number; to view
that I now have to create four JOINs, and typing those ON clauses
wastes any effort I may have saved in the now simplified WHERE clause.

Jamie.

--

**Jamie Collins** · December 7th, 2007, 09:18 AM posted to microsoft.public.access.tablesdbdesign

On Dec 6, 6:50 pm, "Pat Hartman" please no wrote:
The only [con] for using autonumber PKs is that a lot of your tables will end
up with lots of numeric columns that you would prefer to view as text when
you open the table.

I've suggested another in this thread: that you have no influence over
physical ordering on disk in Jet other than by using PRIMARY KEY and
there is virtually no case in favour of clustering on a incremental
Autonumber column, being the Autonumber algorithm of choice because it
generates low value positive integers that are easy to type -- see
Dale Fye's 2 cents in this thread. In other words, putting the PK on
an Autonumber at best wastes an opportunity for better optimization
and at worst causes poor performance.

Access does have an issue
with autonumbers that are not defined as primary keys

What is the issue you are alluding too? TIA.

Jamie.

--

**Jamie Collins** · December 7th, 2007, 09:35 AM posted to microsoft.public.access.tablesdbdesign

On Dec 6, 6:50 pm, "Pat Hartman" please no wrote:
the autonumber would
ALWAYS be the PK and would be used for all relationships. When I have
business rules to satisfy that revolve around uniqueness of "natural" keys,
I use unique indexes.

The main issue I have with the 'Autonumber PK' movement is that the
message about having table constraints on the candidate keys (and
other business rules) doesn't always come across. You've said the
right things (IMO) about so-called surrogates in this thread but I
would wager than most users in the 'Autonumber PK' camp put a PK on
the Autonumber and think, "Job done" as regards table constraints.

Another significant issue is when Autonumber values are exposed to
users. Most informed people agree this should be avoided but again the
message isn't always cleat and so it does happen. In an ideal world,
surrogates would be implemented under the hood and not even be exposed
to DBA, developers, etc. I encourage people to use the random flavour
of Autonumber to discourage this practice (and persistent offenders
should be forced to use GUIDs g).

Jamie.

--

**Jamie Collins** · December 7th, 2007, 09:51 AM posted to microsoft.public.access.tablesdbdesign

On Dec 6, 11:39 pm, "David W. Fenton"
wrote:
=?Utf-8?B?TS4=?= wrote :

The unique indexes are a function of the data, while the
primary key is a function of your database structure.

I don't see how the fact that your SQL DBMS of choice implements
unique constraints using contiguous storage and indexes has anything
to do with the 'function of the data'.

Jamie.

--

**Jamie Collins** · December 7th, 2007, 09:57 AM posted to microsoft.public.access.tablesdbdesign

On Dec 6, 8:03 pm, "Tony Toews [MVP]" wrote:
Consider if you have a parent and subform based a composite PK?
That's a lot of extra work there figuring out what fields go where.

This implies you design your database schema (tables etc) for the
convenience of your front end (forms and subforms) and/or for your own
convenience?! My approach is to design the database schema entirely
independent of the front ends (which is done by another person
anyhow).

Jamie.

--

**Pat Hartman** · December 7th, 2007, 03:59 PM posted to microsoft.public.access.tablesdbdesign

I agree with you that way too many people do not understand the need for
unique indexes to support business rules when using autonumbers as primary
keys. I don't believe I said that autonumbers should be exposed.
Presumably, if there are candidate keys available, they are what the user
would see and use.

Clustered indexes are a double-edged sword and we won't discuss how they
work in "real" databases. In a Jet database, each table is reordered in
sequence by its primary key whenever the database is compacted. That is
what fools people into thinking queries without sort orders will always
return rows in a predictable order. The reordering acts sort of like a
clustered index. However, unlike "real" databases, all new records and some
changed records (if the record size increases) are stored outside of the
clustered sequence and must be reordered when the database is compacted.
Clustered indexes only provide benefit when large numbers of rows are
regularly retrieved in key sequence. When records are retrieved randomly
(one at a time) or based on other attributes, the clustering has no impact
whatsoever and alternate indexes are much more important for optimizing
query performance.

The most useful clustered index that I have ever used segregated a 9 million
row transaction table into months (days actually). Almost all reporting
access was to the most recent three months so the index was extremely
useful. And since the transaction date didn't change, there wasn't an issue
with outliers.

"Jamie Collins" wrote in message
...
On Dec 6, 6:50 pm, "Pat Hartman" please no wrote:
the autonumber would
ALWAYS be the PK and would be used for all relationships. When I have
business rules to satisfy that revolve around uniqueness of "natural"
keys,
I use unique indexes.

The main issue I have with the 'Autonumber PK' movement is that the
message about having table constraints on the candidate keys (and
other business rules) doesn't always come across. You've said the
right things (IMO) about so-called surrogates in this thread but I
would wager than most users in the 'Autonumber PK' camp put a PK on
the Autonumber and think, "Job done" as regards table constraints.

Another significant issue is when Autonumber values are exposed to
users. Most informed people agree this should be avoided but again the
message isn't always cleat and so it does happen. In an ideal world,
surrogates would be implemented under the hood and not even be exposed
to DBA, developers, etc. I encourage people to use the random flavour
of Autonumber to discourage this practice (and persistent offenders
should be forced to use GUIDs g).

Jamie.

--

**Jamie Collins** · December 7th, 2007, 04:32 PM posted to microsoft.public.access.tablesdbdesign

On Dec 7, 3:59 pm, "Pat Hartman" please no wrote:
users. Most informed people agree this should be avoided but again the
message isn't always clear

I don't believe I said that autonumbers should be exposed.

You hadn't said that Autonumbers shouldn't be exposed either ;-)

Clustered indexes are a double-edged sword

Better IMO to try to wield the sword rather than ignore it and risk
stabbing yourself in the foot g.

[BTW I never got that idiom: surely a double-edged blade pierces the
victim much more effectively than a single edged blade, making a
better weapon of attack...?]

Jamie.

--

**John W. Vinson** · December 7th, 2007, 05:22 PM posted to microsoft.public.access.tablesdbdesign

On Fri, 7 Dec 2007 01:35:50 -0800 (PST), Jamie Collins
wrote:

I encourage people to use the random flavour
of Autonumber to discourage this practice (and persistent offenders
should be forced to use GUIDs g).

eeep! Is that allowed under the Geneva Convention?

bg

John W. Vinson [MVP]

**Dale Fye** · December 7th, 2007, 05:37 PM posted to microsoft.public.access.tablesdbdesign

I disagree, on two points.

1. I design my schemas as close to 3rd or 4th normal form as feasible.
But, rather than carry around the excess baggage of a 3, 4, or 5 field PK, I
have elected to take the path that is most efficient given my chosen
front/back tools (Access). However, even when I get the opportunity to take
advantage of the server side processing that SQL Server provides, I still use
a 'meaningless' PKs to speed development (makes my job easier) and minimize
the risk of mistakes (as one of the other posts mentioned, an update query
where one of the joins is missing can be disasterous).

2. I really enjoy the challenge of developing an application that is both
functional and user friendly. Unlike most "programmers", I also consider
myself to be extremely literate in the "database" side of things, and take
great exception with programmers who kludge their database together based on
their desired form layout.

Dale
--
Don''t forget to rate the post if it was helpful!

email address is invalid
Please reply to newsgroup only.

"Jamie Collins" wrote:

On Dec 6, 8:03 pm, "Tony Toews [MVP]" wrote:
Consider if you have a parent and subform based a composite PK?
That's a lot of extra work there figuring out what fields go where.

This implies you design your database schema (tables etc) for the
convenience of your front end (forms and subforms) and/or for your own
convenience?! My approach is to design the database schema entirely
independent of the front ends (which is done by another person
anyhow).

Jamie.

--

Thread Tools
Show Printable Version Email this Page
Display Modes
Linear Mode Switch to Hybrid Mode Switch to Threaded Mode