If this is your first visit, be sure to check out the FAQ by clicking the link above. You may have to register before you can post: click the register link above to proceed. To start viewing messages, select the forum that you want to visit from the selection below. |
|
|
|
Thread Tools | Display Modes |
#11
|
|||
|
|||
Multi-Field Primary Key
So I'm the chump who's being duped by the slick-talking autonumber heretics,
is that it? I can find all sorts of arguments pro and con on any topic, so the existence of advocacy for one position or the other does nothing by itself to sway my views. Autonumber works fine for my purposes, but my point, which you ignored, was that even if I was to accept your position on "natural" keys, what is "natural" about a phone number? Unless you are sorting by phone number to find gaps or something like that the ordering of a phone number field is surely as arbitrary as an autonumber field. "Jamie Collins" wrote in message oups.com... BruceM wrote: I'll just mention that a many people, including Access MVPs, consider autonumber to be a perfectly acceptable and efficient choice for PK. I'll just mention that many people, including the Emperor's most trusted advisors, considered his New Clothes to be a perfectly acceptable choice for a state occasion g. Researching a natural key can involve time and effort (though usually a google search will quickly reveal an industry-standard key, ISO standard, etc). Designing your own key can be hard work. Is there anything simpler than incrementing an integer? Sure, an integer that is incremented for you! It is mere coincidence in this case that 'the easy way out' is the not 'the road less travelled'? Jamie. -- |
#12
|
|||
|
|||
Multi-Field Primary Key
BruceM wrote:
So I'm the chump who's being duped by the slick-talking autonumber heretics, is that it? I can find all sorts of arguments pro and con on any topic, so the existence of advocacy for one position or the other does nothing by itself to sway my views. Autonumber works fine for my purposes, but my point, which you ignored, was that even if I was to accept your position on "natural" keys, what is "natural" about a phone number? Unless you are sorting by phone number to find gaps or something like that the ordering of a phone number field is surely as arbitrary as an autonumber field. Bruce there are two different philosophies at issue here. Surrogate numeric keys (AutoNumber or otherwise) do solve a lot of technical problems for database and database application developers and many seasoned professionals who know what they are talking about use them and recommend them. However; they are not part of proper Relational Database Design Theory. If you are discussing the construction of "proper" relational databases then surrogates will be argued against by the experts almost without exception. If you are discussing the practical pitfalls and solutions of building working databases then you will find a lot of people who make their living doing such things arguing for the use of surrogate keys. These really are two different discussions, but often the boundaries get blurred during exchanges in these groups. I for one have no problem with the judicious use of surrogate keys and often use them myself. However; I avoid making statements like "All Primary Keys should be numeric surrogates" because their use is simply not warranted in every case and my preferences are not applicable in every case. I will point out their advantages when discussing databases with other developers, but I would not fault the decision of another developer not to use them. -- Rick Brandt, Microsoft Access MVP Email (as appropriate) to... RBrandt at Hunter dot com |
#13
|
|||
|
|||
Multi-Field Primary Key
Rick, thanks for the explanation. I have done some reading on the subject,
but usually get bogged down when the discussion gets into clustered indexes, physical ordering on the disk, and the like. I was responding in part because the OP wrote (in response to Jamie): "the worst arrangement is to have the autonumber alone be the primary key because it is completely random and irrelevant and there will be the most disk head movement as I try to fetch phone#'s from the table (assuming I am fetching them in sequential phone# order)." I wondered if adding another essentially arbitrary number (phone #) to the key would somehow improve this situation, assuming that the part about physical location on the disk is essentially accurate in the first place. I would think that it would make more sense, in a case involving a surrogate key as part of a multi-field PK, to have the natural part of the key be something such as LastName that would actually be used for sorting and filtering. If one is going to combine something with autonumber, from the uniqueness point of view it doesn't much matter what that is. Anyhow, I appreciate your adding some comments I can actually begin to understand. By the way, who decided what is "proper"? I'm not disputing the point so much as wondering how it came to be. "Rick Brandt" wrote in message et... BruceM wrote: So I'm the chump who's being duped by the slick-talking autonumber heretics, is that it? I can find all sorts of arguments pro and con on any topic, so the existence of advocacy for one position or the other does nothing by itself to sway my views. Autonumber works fine for my purposes, but my point, which you ignored, was that even if I was to accept your position on "natural" keys, what is "natural" about a phone number? Unless you are sorting by phone number to find gaps or something like that the ordering of a phone number field is surely as arbitrary as an autonumber field. Bruce there are two different philosophies at issue here. Surrogate numeric keys (AutoNumber or otherwise) do solve a lot of technical problems for database and database application developers and many seasoned professionals who know what they are talking about use them and recommend them. However; they are not part of proper Relational Database Design Theory. If you are discussing the construction of "proper" relational databases then surrogates will be argued against by the experts almost without exception. If you are discussing the practical pitfalls and solutions of building working databases then you will find a lot of people who make their living doing such things arguing for the use of surrogate keys. These really are two different discussions, but often the boundaries get blurred during exchanges in these groups. I for one have no problem with the judicious use of surrogate keys and often use them myself. However; I avoid making statements like "All Primary Keys should be numeric surrogates" because their use is simply not warranted in every case and my preferences are not applicable in every case. I will point out their advantages when discussing databases with other developers, but I would not fault the decision of another developer not to use them. -- Rick Brandt, Microsoft Access MVP Email (as appropriate) to... RBrandt at Hunter dot com |
#14
|
|||
|
|||
Multi-Field Primary Key
BruceM wrote:
Rick, thanks for the explanation. I have done some reading on the subject, but usually get bogged down when the discussion gets into clustered indexes, physical ordering on the disk, and the like. I was responding in part because the OP wrote (in response to Jamie): "the worst arrangement is to have the autonumber alone be the primary key because it is completely random and irrelevant and there will be the most disk head movement as I try to fetch phone#'s from the table (assuming I am fetching them in sequential phone# order)." I wondered if adding another essentially arbitrary number (phone #) to the key would somehow improve this situation, assuming that the part about physical location on the disk is essentially accurate in the first place. I would think that it would make more sense, in a case involving a surrogate key as part of a multi-field PK, to have the natural part of the key be something such as LastName that would actually be used for sorting and filtering. If one is going to combine something with autonumber, from the uniqueness point of view it doesn't much matter what that is. Anyhow, I appreciate your adding some comments I can actually begin to understand. By the way, who decided what is "proper"? I'm not disputing the point so much as wondering how it came to be. The theory purists will mostly quote genuflect Codd and Date /genuflect. They pretty much "wrote the books" on relational set theory. With that foundation to build on many "real world practioners" found that adhering too strictly to theory can be really difficult and less productive, so for the sake of problem solving and productivity they deviated from pure theory. Some of these "devious" ;-) practices have become so common-place that many now conflate them to theory in their own right. While they are common and useful one should be aware that they are deviations from relational theory, not part of it. -- Rick Brandt, Microsoft Access MVP Email (as appropriate) to... RBrandt at Hunter dot com |
#15
|
|||
|
|||
Multi-Field Primary Key
"Rick Brandt" wrote in
news The theory purists will mostly quote genuflect Codd and Date /genuflect. They pretty much "wrote the books" on relational set theory. With that foundation to build on many "real world practioners" found that adhering too strictly to theory can be really difficult and less productive, so for the sake of problem solving and productivity they deviated from pure theory. If you're choosing your primary keys based on ordering of the storage, you're way outside the theoretical realm. That's an implementation issue and is going to be answered differently for each different database engine (e.g., Jet will be different from MySQL). I can't imagine a telephone number table that would *need* a PK that didn't involve the parent foreign key. In any event, you'd have to have hundreds of thousands of records in the telephone table for it to make a noticeable difference. -- David W. Fenton http://www.dfenton.com/ usenet at dfenton dot com http://www.dfenton.com/DFA/ |
#16
|
|||
|
|||
Multi-Field Primary Key
BruceM wrote:
what is "natural" about a phone number? A natural key can be verified in the reality being modelled (hint: pick up the phone, press the buttons). If you are lucky, the natural key will have a trusted source who will handle the rare (hopefully) occurrences of duplicate entities (flippant and if you are unlucky, the trusted source will be a phone company; Bill Bryson's three laws of natu you can't re-live the past, the waiter can't see you until he's ready and you can't beat the phone company /flippant). In case your question was, "Is phone number better than autonumber as regards clustering?" my answer would be, I suspect not but it will depend on the primary usage of the table. From my experience with (paper) telephone directories, physical ordering on subscriber name (last name, first names) suits my primary usage better. I'll just mention that a many people, including Access MVPs, consider autonumber to be a perfectly acceptable and efficient choice for PK. I'll just mention that many people, including the Emperor's most trusted advisors, considered his New Clothes to be a perfectly acceptable choice for a state occasion g. I can find all sorts of arguments pro and con on any topic, so the existence of advocacy for one position or the other does nothing by itself to sway my views. That was *my* point. The fact you have found all sorts of people who believe an autonumber PK makes a fine PK does not make it true. So I'm the chump who's being duped by the slick-talking autonumber heretics, is that it? You get to choose your own role. In the versions of the tale I've heard, there has been no such role as "person pointing out that there are people who think the Emperor is fully clothed" because, I guess, such a role would be of no relevance to the plot. Me, I'm the small child who, at the risk of public ridicule, questions popular belief when it appears to be based on fallacy. Jamie. -- |
#17
|
|||
|
|||
Multi-Field Primary Key
The OP, explaining his understanding of your comments, wrote "the worst
arrangement is to have the autonumber alone be the primary key because it is completely random and irrelevant". I questioned whether a phone number is less random in terms of ordering. I understand that it is natural in the sense that it corresponds to a real-world application, but it is unlikely to be used for ordering except maybe in the case of a call list. That being the case, combining autonumber with phone number offers dubious advantage at best over autonumber alone. I understand a case may be made for natural keys, but I tend to regard invariable rules with some suspicion. "Jamie Collins" wrote in message ups.com... BruceM wrote: what is "natural" about a phone number? A natural key can be verified in the reality being modelled (hint: pick up the phone, press the buttons). If you are lucky, the natural key will have a trusted source who will handle the rare (hopefully) occurrences of duplicate entities (flippant and if you are unlucky, the trusted source will be a phone company; Bill Bryson's three laws of natu you can't re-live the past, the waiter can't see you until he's ready and you can't beat the phone company /flippant). In case your question was, "Is phone number better than autonumber as regards clustering?" my answer would be, I suspect not but it will depend on the primary usage of the table. From my experience with (paper) telephone directories, physical ordering on subscriber name (last name, first names) suits my primary usage better. I'll just mention that a many people, including Access MVPs, consider autonumber to be a perfectly acceptable and efficient choice for PK. I'll just mention that many people, including the Emperor's most trusted advisors, considered his New Clothes to be a perfectly acceptable choice for a state occasion g. I can find all sorts of arguments pro and con on any topic, so the existence of advocacy for one position or the other does nothing by itself to sway my views. That was *my* point. The fact you have found all sorts of people who believe an autonumber PK makes a fine PK does not make it true. So I'm the chump who's being duped by the slick-talking autonumber heretics, is that it? You get to choose your own role. In the versions of the tale I've heard, there has been no such role as "person pointing out that there are people who think the Emperor is fully clothed" because, I guess, such a role would be of no relevance to the plot. Me, I'm the small child who, at the risk of public ridicule, questions popular belief when it appears to be based on fallacy. Jamie. -- |
#18
|
|||
|
|||
Multi-Field Primary Key
Rick Brandt wrote: there are two different philosophies at issue here. Surrogate numeric keys (AutoNumber or otherwise) do solve a lot of technical problems for database and database application developers and many seasoned professionals who know what they are talking about use them and recommend them. However; they are not part of proper Relational Database Design Theory. If you are discussing the construction of "proper" relational databases then surrogates will be argued against by the experts almost without exception. If you are discussing the practical pitfalls and solutions of building working databases then you will find a lot of people who make their living doing such things arguing for the use of surrogate keys. These really are two different discussions, but often the boundaries get blurred during exchanges in these groups. I for one have no problem with the judicious use of surrogate keys and often use them myself. However; I avoid making statements like "All Primary Keys should be numeric surrogates" because their use is simply not warranted in every case and my preferences are not applicable in every case. I will point out their advantages when discussing databases with other developers, but I would not fault the decision of another developer not to use them. A key uniquely identifies data and the unit of work in SQL is a row (record). 'Primary keys' are a subset of 'keys'. Disregard the 'primary' distinction for a moment. I recognise *three* uses for autonumber. 1) As a so-called surrogate, where the natural key is being constrained and the autonumber is used to 'associate' related tables. The main advantage is that a single column integer is efficient and easier to work with. IMO most people take this approach out of habit rather than to tackle a real life problem. FWIW the 'theorists' are split: Codd suggests a true surrogate should not be visible, à la an index; Date seems to be less strict e.g. might be visible to a DBA. I'm not a theorist and I don't really have a problem with such usage. The problem really is the physical implementation: from a SQL point of view, there is no requirement for a referencing (child) table to physically store the referenced key value (in some SQL products there they are physically one and the same) but in Jet physically repeated values between tables are a reality and sometimes physical considerations (e.g. performance) must be addressed. Natural key + autonumber is a fair compromise because you have a 'relational' key (the natural key) plus a physical connector (autonumber). 2) As a so-called artificial key where no natural key exists i.e. the composite of all the attributes being modelled could be duplicated for distinct entities and modelling more attributes is not an option. You will have to accept that you will never be able to associate entities (e.g. if you want to collect all posts by me - and why wouldn't you g? - then you're our of luck because lots of people post via google, handles can be duplicated, email addresses can be reassigned, sig lines can be faked, personal style can be mimicked, etc). Your database becomes the trusted source. The problem here is that the autonumber must be exposed to users and logically autonumbers aren't the best for real word situations: people hate typing GUIDs (if they can remember one in the first place g); random integers are easily mistyped (hint: a check digit, à la ISBN, helps here); with sequential integers people get bothered by missing values (e.g. auditors). Autonumbers also have physical problems: a sequence must be generated on the same machine, that's why your incrementing autonumber are changed to when using replication, but even random numbers can produce clashes. And consider that autonumber values can be assigned explicitly i.e. using INSERT INTO syntax. On the other hand, designing a key and maintaining its values takes time and effort and Access is all about RAD so I can see the appeal (but I maintain my right to level charges of unimaginative design and contempt for users). 3) As a so-called uniquifier. Designed to ensnare the clueless (subjective, I know, but I can't think of a better way of putting it). You know the message: "Although a primary key isn't required, it's highly recommended. A table must have a primary key for you to define a relationship between this table and other tables in the database. Do you want [Access] to create a primary key now?" Lies, all lies: a) logically, a table *does* require a key, otherwise it's a 'heap'; b) a table does not need a primary key or other key to define a relationship (it needs a key to 'enforce referential integrity' but it doesn't need to be the primary key), rather it merely needs a compatible data type in anther table; c) if you choose 'Yes', Access doesn't create a key, it merely creates an autonumber to 'uniquify' you data. A key should prevent duplicates. Think about it: an autonumber doesn't *prevent* duplicates, rather it *enables* duplicates. Try it: create a single column of type 'number' and create several rows with the same value. say, 1. Now, are those 1s the same or are they all different? The autonumber is there to tell you they are all different but do you believe it? In case it's not clear, it's this third flavour of autonumber I have a problem with. Back to 'primary key'. The term can be traced back to an early mistake made by Ted Codd. You have number of 'candidate key' and you pick one to be 'primary key'. He later realised that all keys are equal but by then it was too late. Relational theory has moved on but SQL took PRIMARY KEY (SQL keywords in uppercase) and now has to deal with it as a legacy issue. First, you don't *need* PRIMARY KEY. If you make your existing primary key columns NOT NULL and constrained them with UNIQUE you would suffer no loss of data integrity because a NOT NULL UNIQUE key is a key and all keys are equal. The NOT NULL property is required for equivalence because a PRIMARY KEY cannot comprise a NULL value. Second, while the idea of PRIMARY KEY is the particular SQL implementation - each SQL product e.g. Jet - would give special meaning to the PRIMARY KEY. This is why everyone says that a SQL table should have a primary key: if it didn't have one at the very worst you would be missing out on the special benefits that PRIMARY KEY has to offer. David W Fenton (downthread) is correct: these special benefits vary from product to product. For Access/Jet, they include: 1) Clustering (physical ordering) on disk on compact; 2) Appearing as bold text in the Relationships diagram; 3) In absence of explicit definition, it will be used when the table is referenced in a relationship or FOREIGN KEY (the letter behaviour is defined in the SQL-92 standard); 4) Prevents you from being nagged, "Although a primary key isn't required..." g. Jamie. -- |
#19
|
|||
|
|||
Multi-Field Primary Key
BruceM wrote: The OP, explaining his understanding of your comments, wrote "the worst arrangement is to have the autonumber alone be the primary key because it is completely random and irrelevant". The OP also wrote, "My idea to have PHONE+AUTONUMBER would be good because at least the phone #'s will be sequentially next to each other on the physical disk." I conclude the OP has understood the issues and physically contiguous PHONE values is good for them. I'm satisifed 'duty of care' has been served. Jamie. -- |
#20
|
|||
|
|||
Multi-Field Primary Key
Jamie Collins wrote: First, you don't *need* PRIMARY KEY. If you make your existing primary key columns NOT NULL and constrained them with UNIQUE you would suffer no loss of data integrity because a NOT NULL UNIQUE key is a key and all keys are equal. The NOT NULL property is required for equivalence because a PRIMARY KEY cannot comprise a NULL value. Second, while the idea of PRIMARY KEY is the particular SQL implementation - each SQL product e.g. Jet - would give special meaning to the PRIMARY KEY. This is why everyone says that a SQL table should have a primary key: if it didn't have one at the very worst you would be missing out on the special benefits that PRIMARY KEY has to offer. There are many articles pointing out that the choice of which candidate key should be primary is *arbitrary* but here's one that seems to make the point well: http://www.aisintl.com/case/relational_keys.html Relational Keys "The primary key of any table is any candidate key of that table which the database designer arbitrarily designates as "primary". The primary key may be selected for convenience, comprehension, performance, or any other reasons. It is entirely proper (albeit often inconvenient) to change the selection of primary key to another candidate key. [...] there is no property of a primary key which is not shared by all other candidate keys in of the table except this arbitrary designation. Unfortunately E-R methodologies and RDBMS products have come to rely on the primary key almost to the exclusion of the concept of candidate keys, which are rarely supported by software." Note, Access/Jet supports candidate keys via NOT NULL UNIQUE. Jamie. -- |
|
Thread Tools | |
Display Modes | |
|
|