If this is your first visit, be sure to check out the FAQ by clicking the link above. You may have to register before you can post: click the register link above to proceed. To start viewing messages, select the forum that you want to visit from the selection below. |
|
|
|
Thread Tools | Display Modes |
#11
|
|||
|
|||
I'm not sure what you just said. What do you mean by "true key value"? What
is a mapping exercise. If I use employeeID as the PK I am not using a PK that is likely to be included in OrderBy clauses, although I will group by employeeID in order to have each employee's record on a separate page or something like that. In that case I group by employee ID, and order by FullName, which is a concatenated field in the report's underlying query. In a company of about 70 employees there have been no duplicates in FullName, but if there are I will order by something additional. My question to you is what do you see as the advantage of updating the records if the PK value changes in nature (as in the example of EmployeeID). Why is it not simpler and cleaner to avoid that possibility? What would you use as PK in an employee records database? "onedaywhen" wrote: BruceM wrote: I have heard of instances where a company changed the format of an EmployeeID number or some other number (by appending a 0, for instance, to allow for more IDs than the current system allowed). Had EmployeeID been used as the primary key, or as part of a combined PK, it could have been pretty messy. Actually, having the true key value in each of the referencing tables would make this quite a straightforward mapping exercise. You'd have to suspend the constraints during the process, of course. Jamie. -- |
#12
|
|||
|
|||
BruceM wrote: What do you mean by "true key value"? I think I meant a 'natural' key. An autonumber is not a key. In your brand new Employees table with its autonumber PK, insert a row for an employee and you get ID=1. Then delete the row and insert the exact same details again. Now you get ID=2. You have two different IDs that refer to the same employee. Your ID column is not a key. In a company of about 70 employees there have been no duplicates in FullName You are aware full name is not likely to be unique, that's good. What would you use as PK in an employee records database? In my country, the government likes to keep tabs on its subjects; something to do with paying tax and qualifying for state benefits, I'm told. It has a department that issues each individual with a 9 character unique identifier. If a new employee can't tell their ID, the employer must construct a temporary number using date of birth and a letter denoting sex and the tax department gets back to them fairly promptly with the actual ID (AFAIK notifying the tax office of a new employee is a legal requirement). When an employer takes on two people within a short period with the same sex and birth date, I guess they either ask one of them to try to find out their ID (usually found on an old payslip) or phone to tax office to see if they can tell the actual IDs. I understand in certain large countries in North America they have something similar, known as a social security number (SSN). What is a mapping exercise You proposed a theoretical scenario where a column (EmployeeID) needed to be changed. Copying the data from the old to the new is what I'm calling a mapping exercise i.e. you have to map (copy) the data from each row in the old table into the corresponding row in the new table. what do you see as the advantage of updating the records if the PK value changes in nature (as in the example of EmployeeID). When doing data conversion/mapping exercises, you have to be sure your actual results correspond to you expected results. I guess there is no disadvantage to using an integer value to identify each row for these purposes but I think I would be happier seeing something that was obviously a key, rather than a simple row identifier. Why is it not simpler and cleaner to avoid that possibility? Let me propose another theoretical scenario: something happens to your employee table and your autonumber column is lost. You now have an EarningsHistory table where each employee is identified using an integer which has no meaning without the original values from your Employees table. Consider the same EarningsHistory table where each row had a unique identifier such as the aforementioned SSN, something verifiable in reality e.g. phone the tax department and ask them which employee has a certain SSN. Do you now see what I mean when I say I feel 'happier' using a real key in the referencing tables? Jamie. -- |
#13
|
|||
|
|||
"onedaywhen" wrote: BruceM wrote: What do you mean by "true key value"? I think I meant a 'natural' key. An autonumber is not a key. In your brand new Employees table with its autonumber PK, insert a row for an employee and you get ID=1. Then delete the row and insert the exact same details again. Now you get ID=2. You have two different IDs that refer to the same employee. Your ID column is not a key. I'm missing your point here. If I delete the record that contains autonumber ID=1 then there is no record containing ID=1. This would be a problem only if I had related records in other tables. In a company of about 70 employees there have been no duplicates in FullName You are aware full name is not likely to be unique, that's good. I was referring to FullName as an Order By field in a report. As a concatenated field in a query it can't have a PK in any case. Combining FirstName and LastName as a PK would be risking duplication. What is a mapping exercise You proposed a theoretical scenario where a column (EmployeeID) needed to be changed. Copying the data from the old to the new is what I'm calling a mapping exercise i.e. you have to map (copy) the data from each row in the old table into the corresponding row in the new table. Got it. I thought you meant something like that, but wasn't sure. Let me propose another theoretical scenario: something happens to your employee table and your autonumber column is lost. You now have an EarningsHistory table where each employee is identified using an integer which has no meaning without the original values from your Employees table. Consider the same EarningsHistory table where each row had a unique identifier such as the aforementioned SSN, something verifiable in reality e.g. phone the tax department and ask them which employee has a certain SSN. Do you now see what I mean when I say I feel 'happier' using a real key in the referencing tables? Yes, I see your point about being able to reconstruct the database if the SSN field is lost. Although I don't quite see how a single field can be lost (a field involved in a relationship can't be deleted until the relationship is undone) I think I would prefer reconstructing the PK field if it came to that. As long as I keep the records ordered by the autonumber field I could add a new autonumber field, then reorder the related table by the FK field and replace all occurrences of the lowest number with 1, of the second lowest with 2, etc. (or something like that). This is assuming my backup also went bad. I'm not trying to be contentious here. I really do appreciate your taking the time to reply and to explain in further detail. I have a feeling that neither of us is likely to change the other's mind, but you do raise some points worth considering. Jamie. -- |
#14
|
|||
|
|||
BruceM wrote: I'm missing your point here. If I delete the record that contains autonumber ID=1 then there is no record containing ID=1. The point is SSN always identifies an entity in reality. I was referring to FullName as an Order By field in a report. As a concatenated field in a query it can't have a PK in any case. Combining FirstName and LastName as a PK would be risking duplication. what the autonumer (IDENTITY) is for, i.e. to eliminate the possibility of duplicates, so make your compound PK be (LastName, FirstName, ID) in that specific order. I think you are placing the wrong significance on the meaning of primary key. For example, consider this: CREATE TABLE Workers ( ID INTEGER IDENTITY(1,1) NOT NULL UNIQUE, last_name VARCHAR(35) NOT NULL, first_name VARCHAR(35) NOT NULL, PRIMARY KEY (last_name, first_name, ID) ) ; CREATE TABLE Wages ( ID INTEGER NOT NULL REFERENCES Workers (ID) ON UPDATE CASCADE ON DELETE CASCADE, start_date DATETIME NOT NULL, end_date DATETIME, wage CURRENCY NOT NULL, PRIMARY KEY (start_date, ID) ) ; In other words, there is no rule (in either Jet or relational theory) that says a FK in one table must use a PK from another; any key (a.k.a. unique identifier) will do. In relational theory, a primary key has no special powers. A PK has special meaning for Jet, though i.e. the aforementioned physical ordering/clustered indexes. If you are not using your PK in terms of its special meaning to Jet, what are you using it for? Jamie. -- |
#15
|
|||
|
|||
"onedaywhen" wrote: BruceM wrote: I'm missing your point here. If I delete the record that contains autonumber ID=1 then there is no record containing ID=1. The point is SSN always identifies an entity in reality. I was referring to FullName as an Order By field in a report. As a concatenated field in a query it can't have a PK in any case. Combining FirstName and LastName as a PK would be risking duplication. what the autonumer (IDENTITY) is for, i.e. to eliminate the possibility of duplicates, so make your compound PK be (LastName, FirstName, ID) in that specific order. ID already makes it unique. That's good enough for me. I think you are placing the wrong significance on the meaning of primary key. For example, consider this: CREATE TABLE Workers ( ID INTEGER IDENTITY(1,1) NOT NULL UNIQUE, last_name VARCHAR(35) NOT NULL, first_name VARCHAR(35) NOT NULL, PRIMARY KEY (last_name, first_name, ID) ) ; CREATE TABLE Wages ( ID INTEGER NOT NULL REFERENCES Workers (ID) ON UPDATE CASCADE ON DELETE CASCADE, start_date DATETIME NOT NULL, end_date DATETIME, wage CURRENCY NOT NULL, PRIMARY KEY (start_date, ID) ) ; In other words, there is no rule (in either Jet or relational theory) that says a FK in one table must use a PK from another; any key (a.k.a. unique identifier) will do. In relational theory, a primary key has no special powers. A PK has special meaning for Jet, though i.e. the aforementioned physical ordering/clustered indexes. What if I want to order the information in some other way. If there is validity to your statements about physical order on the disk it seems to me it would only apply to one way of looking at the data (sorted alphabetically by name). What happens if I want to sort by age or by gender, or whatever? I use a PK because it is the established method. I am interested in using Access to meet my needs, and feel no compulsion to first reinvent the wheel. PK works, so I use it. I don't care about physical order on the disk or clustered indexes or anything of that sort. If I saw some objective benchmark data showing me that my databases are experiencing a performance hit because of my use of autonumber PKs I would rethink what I am doing, but until then I will concentrate on the end users. If my databases are suffering in terms of performance it probably has more to do with inefficient coding and things of that sort than to physical order on the hard drive. If you are not using your PK in terms of its special meaning to Jet, what are you using it for? Because it works. I really do appreciate your taking the time to reply in such detail, but I am unlikely to change my ways. Jamie. -- |
#16
|
|||
|
|||
BruceM wrote: ID already makes it unique. That's good enough for me. Let me give you a real life example I encountered just last week. I am converting a database table with FirstName and LastName columns plus the ubiquitous single column PK incrementing INTEGER autonumber column named ID. The autonumber takes care of the duplicates so there could happily be two 'Jean Dupont's in the database. However, a problem arises when data arrives via an email and a procedure named (something like) GetIDUsingName kicks in, which queries the database using LastName = 'Dupont' and FirstName = 'Jean' and takes the first row from the resulting recordset. No problem for me: the client's instruction is to do a straight port, bugs and all and the autonumber PK actually makes this bizarre function highly predictable in practice. But it makes me wonder if that duplicate-breaking autonumber is actually a solution for anything ... What if I want to order the information in some other way. If there is validity to your statements about physical order on the disk it seems to me it would only apply to one way of looking at the data (sorted alphabetically by name). What happens if I want to sort by age or by gender, or whatever? You should always use an ORDER BY. My point is, when the physical order happens to coincide with the requested ORDER BY you will get a performance gain. Obviously you can have only one physical order so you should choose wisely e.g. the order you most frequent request in your ORDER BY statements. By choosing your autonumber for the physical order, you never get the performance benefit; looked at another way, you are always taking a performance hit. ORDER BY is not the prime example, though. GROUP BY and BETWEEN are more significant. That's why I used that example of, 'grab me all the phone numbers of people whose last name begins with A'. If the required data rows are already next to each other, as it is in a paper copy telephone directory, the performance advantage is clear (remember the BETWEEN and GROUP BY are applied to the rowset early on, whereas ORDER BY is only applied at the end of the process). I know of only one way of choosing the physical order in my Jet table and that is to use the PRIMARY KEY functionality and ensure the file is regularly compacted. If I want a column that cannot be null and must contain unique values, but would not be the best choice for the physical order of the table, I can use NOT NULL with either a UNQIUE CONSTRAINT or a UNIQUE INDEX. I use a PK because it is the established method. I am interested in using Access to meet my needs, and feel no compulsion to first reinvent the wheel. PK works, so I use it. I don't care about physical order on the disk or clustered indexes or anything of that sort. If I saw some objective benchmark data showing me that my databases are experiencing a performance hit because of my use of autonumber PKs I would rethink what I am doing But you are advising others to use autonumber PKs, thus propagating bad advice. I guess my methodology is to try to find the best way of doing something. I can use logic to determine that a table in physical order of its 'prime usage' will have a performance advantage so I will go with this approach unless there is another issue e.g. doing so would make my schema more difficult to maintain. With the greatest respect, your approach seems to be more like, 'Well, it works for me and doesn't run like a dog so I won't change my ways until someone reputable tells me I should.' But thank you, as ever, for listening. Jamie. -- |
|
Thread Tools | |
Display Modes | |
|
|
Similar Threads | ||||
Thread | Thread Starter | Forum | Replies | Last Post |
Are Primary keys needed in child tables? | Nevie and Phil | Database Design | 2 | January 16th, 2005 07:23 AM |
Logical question on primary keys... | Access rookie | Database Design | 4 | January 8th, 2005 11:26 AM |
Using Primary Keys | Jodie | General Discussion | 1 | July 14th, 2004 08:49 PM |
Choosing Primary and Foreign Keys | A.V.H | New Users | 8 | May 23rd, 2004 09:12 PM |