View Single Post
  #15  
Old August 17th, 2005, 01:14 PM
Jamie Collins
external usenet poster
 
Posts: n/a
Default


Jeff Boyce wrote:
I have seen considerable debate about using autonumber primary keys. One
school appears to subscribe to the notion that a natural key can (and
perhaps even 'must') be found for every row/record.


Good timing: I put an autonumber (incrementing INTEGER) into a database
just one hour ago. I needed to know the relative order in which rows
were being added to the table and autonumber is convenient. The table
already had a natural key being a composite of the entity's attributes,
in the Dr Dodd tradition; I'd have been in a whole lot of trouble if I
didn't.

I've looked after databases where a natural key exists and an
autonumber is used as an 'alternate' key. I haven't sought to change
this because there is some justification for this approach e.g. a
single INTEGER column foreign key is more convenient to code than a
multi- text column key. There may be some storage and performance
benefits but I haven't investigated this myself, a correct data model
being my primary consideration.

Another school
subscribes to the notion that an autonumber primary key simply provides a
unique row/record identifier.


Using an autonumber as a key where no natural key exists is asking for
trouble. It is uniqueness merely for the sake of it (I'd rather have a
'heap' i.e. without a key and technically not a table at all). You must
expose the autonumber for it to be useful and almost everyone agrees
you should never expose an autonumber in this way.

We need to be clear about the meaning of PRIMARY KEY in Access/Jet.
Here is a pertinent newsgroup post from Joe Celko: "Dr. Codd started
with the concept of a PRIMARY KEY in his first writing; it was the old
sort key from sequential file processing and magnetic tape storage in
disguise. It was awhile before Dr. Codd changed his mind and said that
all keys are equally keys, and we don't need a special one in a
relational database, like you did in a tape system. Unfortunately, the
first SQL systems were build on existing file systems and the idea of
the PRIMARY KEY had become part of the language... The UNIQUE
constraint lets you have multiple keys on a table." So in relational
terms, there is no difference between a key defined as UNQIUE and a key
defined as PRIMARY KEY.

In the Access/Jet implementation, PRIMARY KEY has special meaning but
it has nothing to do with relational keys. Allow me to quote myself
g: "PRIMARY KEY has special meaning for Jet [Access]. In relational
terms, a key is a key and PRIMARY KEY has no special meaning. However,
PRIMARY KEY has traditionally taken on a meaning particular to each SQL
product and Jet is no exception. For Jet, the special meaning is that
PRIMARY KEY provides the only means of specifying the clustered index
for a table. The clustered index is non-maintained, meaning the index
is only 'physically' rebuilt when the file is compacted; it is,
however, still maintained 'logically' between compacts. Some people
have a problem with the term 'clustered index' as regards Jet (don't
make the mistake on thinking 'not supported' means 'does not exist') so
let me clarify: PRIMARY KEY determines the physical ordering on disk.
For best performance, a table should be physically ordered on the
columns of its 'primary' use i.e. those used in GROUP BY and BETWEEN
clauses. Think of a paper copy telephone directory, which can obviously
have only one physical order: if I'm using it primarily for search such
as 'get listings for people whose last name begins with the letter C'
then I'd want it ordered on last name because it would take me no time
at all to rip out the required pages. Similarly, it pays to choose your
PRIMARY KEY carefully and in accordance with its implications for
physical ordering on disk. If you are using PRIMARY KEY to mean a
UNIQUE constraint/index only then your database may be taking a
performance every time a table with a poorly chosen PRIMARY KEY is
queried (pay no attention to the message that tells you, 'A table must
have a primary key for you to define a relationship between this table
and other tables in the database,' it lies, it lies). And this is why
an autonumber is usually a poor choice for PRIMARY KEY: how useful
would you find a paper copy telephone directory physically ordered on
telephone number?"

Given that PRIMARY KEY affects physical ordering and nothing else, I
can see no justification for an autonumber *solely* being PRIMARY KEY.
There may, however, be some justification for including an autonumber
in a *composite* PRIMARY KEY assuming it was the last column specified
in the PK's definition e.g.
PRIMARY KEY (key_col1, key_col2, ..., autonumber_col).

Jamie.

--