#11
Physical is Disk!

In Jet, the clustered index can only be the Primary Key. You may need several fields to define uniqueness; several fields can make up an index, and a primary key is itself an index (as opposed to a field). Order can be anything you want, whenever you want it, using SQL. If you are going to sort by a specific field or combination of fields, you may consider adding an index to that field or combination of fields. Indexes speed things up when sorting and analysing data, but they can slow things down when you are inserting data, especially in bulk updates.

--
Slainte
Craig Alexander Morrison
Crawbridge Data (Scotland) Limited

"BruceM" wrote in message ...
Thank you for the explanation. It makes sense that it has to do with physical ordering in a table rather than on the disk. Having said that, I cannot discover the connection between indexes, the table's Order By property, and anything else that suggests an order within the table, on the actual order of records in the table. Order By, in particular, seems to accomplish nothing.

Regarding John Doe, it may well be a name used by more than one person. How does this fit in with clustered indexes? I may need duplication in that field.

Suppose I wanted to create a clustered index in an Access table. How would I do that? The term does not appear in Access Help, and discussions of the subject tend to assume the reader knows what a clustered index is and how to create one. Even if one is created, what benefits will I notice?

"Craig Alexander Morrison" wrote in message ...
Jet 4.0 and 3.5 (and earlier versions) cluster on the Primary Key, and a Compact will keep it managed. Indeed, a clue to this is in the Registry. The keys

    HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Jet\3.5\Engines
    HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Jet\4.0\Engines

both contain the setting CompactByPKey. I am not sure what would happen if you changed the above setting from 1. I expect 0 would skip the clustering; I am not sure if any other setting would be valid.

SQL Server generally clusters on the Primary Key; however, you can select another index.

AutoNumbers are very poor devices to truly define a unique record in the real world. You can enter the name John Doe 1,000,000 times in your database if the Primary Key is an AutoNumber and you have failed to do something to prevent the creation of 1,000,000 John Does. You may have 1,000,000 unique records, but so what?

Recommending the AutoNumber as Primary Key without pointing out the dangers, and without suggesting the definition and declaration of the natural key (should one exist), is unwise.

BTW A clustered index is merely a physical ordering of the records in a table in the database file. Using the true natural key (should one exist) as the primary key will ensure that all the records with a similar PK will be physically located next to each other. Using an AutoNumber (sequential order) as PK will mean the records are clustered according to their creation order.

Using IDENTITY and AutoNumber as PK defeats the purpose of PK. This is not so bad in SQL Server, as it allows you to choose something more sensible to cluster on if you have an IDENTITY field in use as PK.

--
Slainte
Craig Alexander Morrison
Crawbridge Data (Scotland) Limited

"BruceM" wrote in message ...
That you disagree with somebody does not make that person wrong. Roger has provided a wide range of assistance in this forum, and has made samples available on his web site. Based on his track record I would be inclined to follow his advice.

If you are trying to convert people to the idea of using clustered indexes, a very basic discussion of what they are would be most helpful. I have taken your suggestion to look at Google groups. There is indeed a lot of discussion, but I have not yet found how I would create a clustered index if I wanted to. My databases with a few thousand records seem to work just fine. Why would I want to put extra effort into something that already works well? I know you have posted code that includes MAKE TABLE or some such, but the utility of such code is not clear.

The other thing I noted in Google groups is that most of the discussion of clustered indexes seems to be in discussions about SQL Server.

wrote in message oups.com...
Roger Carlson wrote:
Autonumber fields make excellent Primary Keys.

You've misunderstood what PRIMARY KEY means. A unique integer which has no meaning in respect of the entities being modelled makes a lousy PRIMARY KEY. Google for "clustered index" in the Access groups. An autonumber is a convenient uniquifier, but uniqueness for its own sake may not be such a good thing.
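The "1,000,000 John Does" point above can be sketched in a few lines. This is only an illustration under stated assumptions: it uses Python's built-in sqlite3 module as a stand-in for Jet (the two engines differ in many details), and the table name, the birth_date column, and the choice of natural key are all hypothetical. The idea it shows is that an autonumber-style key by itself accepts any number of duplicates, while a declared unique constraint on the natural key rejects them.

```python
import sqlite3

# SQLite stands in for Jet here; this is an illustration, not Access behaviour.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE people (
        person_id  INTEGER PRIMARY KEY AUTOINCREMENT,  -- surrogate, similar in spirit to an AutoNumber
        last_name  TEXT NOT NULL,
        first_name TEXT NOT NULL,
        birth_date TEXT NOT NULL,
        UNIQUE (last_name, first_name, birth_date)     -- hypothetical natural key
    )
    """
)


def add_person(last_name, first_name, birth_date):
    """Insert a person; return True on success, False if the natural key already exists."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO people (last_name, first_name, birth_date) VALUES (?, ?, ?)",
                (last_name, first_name, birth_date),
            )
        return True
    except sqlite3.IntegrityError:
        return False


first_attempt = add_person("Doe", "John", "1970-01-01")
second_attempt = add_person("Doe", "John", "1970-01-01")  # same natural key again
row_count = conn.execute("SELECT COUNT(*) FROM people").fetchone()[0]
print(first_attempt, second_attempt, row_count)
```

Without the UNIQUE line, both inserts would succeed and each row would simply receive a fresh person_id.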
#12
What would you do to guarantee uniqueness in a Contacts table or some such involving names and addresses, in light of the fact that names and addresses are subject to change?

SQL underlies Access queries. The design grid is a sort of SQL GUI (as I understand it). So I think you're saying that displayed order (e.g. sorted by last name) is not what you are talking about when you talk about physical order. If I understand, you are saying that the structure of the index determines the order on the disk, not the order in the table when it is viewed directly.

I have a database that includes an Employees table. The primary key is the EmployeeID. If I had it to do over again I might have used something else, because it is at least possible that they will one day change the format of EmployeeID, which is just a sequential 4-digit number. In most cases I sort the employee names by last name. Would adding an index to that field maybe speed up some operations, even though the list is rather small (fewer than 100 current employees, along with a number of former employees)?
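For the question about indexing the last-name field, here is a minimal sketch of what adding such an index looks like and how one might check that the engine uses it. It runs on Python's built-in sqlite3 module, not on Jet, so the query-plan text is SQLite's own; the table, column, and index names are made up for the example. With fewer than 100 rows, any speed difference is unlikely to be noticeable either way.

```python
import sqlite3

# SQLite stands in for Jet; names below are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employees ("
    " employee_id INTEGER PRIMARY KEY,"
    " last_name TEXT NOT NULL,"
    " first_name TEXT NOT NULL)"
)
conn.executemany(
    "INSERT INTO employees (employee_id, last_name, first_name) VALUES (?, ?, ?)",
    [(1001, "Smith", "John"), (1002, "Jones", "Mary"), (1003, "Adams", "Ann")],
)

# A secondary index on the column used for lookups and sorting.
conn.execute("CREATE INDEX idx_employees_last_name ON employees (last_name)")

# Ask SQLite how it would run a lookup by last name; the last column of each
# plan row is a human-readable description.
plan_rows = conn.execute(
    "EXPLAIN QUERY PLAN SELECT employee_id FROM employees WHERE last_name = ?",
    ("Smith",),
).fetchall()
plan_text = " ".join(str(row[-1]) for row in plan_rows)
uses_index = "idx_employees_last_name" in plan_text

# Display order still comes from ORDER BY, whatever the physical order is.
sorted_names = [
    row[0] for row in conn.execute("SELECT last_name FROM employees ORDER BY last_name")
]
print(uses_index, sorted_names)
```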
#13
"BruceM" wrote in message ...
I have a database that includes an Employees table. The primary key is the EmployeeID. If I had it to do over again I might have used something else, because it is at least possible that they will one day change the format of EmployeeID, which is just a sequential 4-digit number.

Which is why most people use completely meaningless Autonumber fields as primary keys: you can't change the value, format, or anything else of a field that is currently being used as a primary key. Also, the autonumber field will usually have a smaller size (on disk, no less) than a more meaningful key. Therefore, if you are using it in a relationship or relationships, the other tables will have to store less information when they are referring to the primary key of this table.

So, for instance, if your EmployeeID were a 15-byte string and you had used an autonumber (a 4-byte Long Integer) as the key instead, all of the other tables that refer to your Employees table would have saved 11 bytes every time they had a foreign key to it, and you could have stored what is now your EmployeeID primary key just once, for a total of just the one 15-byte storage of the EmployeeID string.

This is the whole point of normalization. Anything that is actually used as data should just be stored once, with the smallest possible reference to it from other places that need to relate to the base data.

More than likely you'll eventually have to move to an Autonumber primary key there for the reasons listed above. Most of us encounter this situation at least once, and from that point forward we use Autonumber primary keys, since fixing the problem once it has developed is much more of a pain than preventing it.

Hope this clarifies;

-Amy
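The layout described above can be sketched as follows. This is an illustration only, run on Python's built-in sqlite3 module and not on Jet; the table and column names (employee_pk, employee_number, timesheets) are hypothetical. It shows the meaningful employee number stored once in the parent table, with child rows carrying only the small surrogate key, so a format change touches a single row.

```python
import sqlite3

# SQLite stands in for Jet; all names are made up for the example.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript(
    """
    CREATE TABLE employees (
        employee_pk     INTEGER PRIMARY KEY AUTOINCREMENT,  -- meaningless surrogate
        employee_number TEXT NOT NULL UNIQUE,               -- the business identifier, stored once
        last_name       TEXT NOT NULL
    );
    CREATE TABLE timesheets (
        timesheet_id INTEGER PRIMARY KEY AUTOINCREMENT,
        employee_pk  INTEGER NOT NULL REFERENCES employees (employee_pk),
        hours        REAL NOT NULL
    );
    """
)

conn.execute("INSERT INTO employees (employee_number, last_name) VALUES ('0042', 'Smith')")
employee_pk = conn.execute(
    "SELECT employee_pk FROM employees WHERE employee_number = '0042'"
).fetchone()[0]
conn.executemany(
    "INSERT INTO timesheets (employee_pk, hours) VALUES (?, ?)",
    [(employee_pk, 7.5), (employee_pk, 8.0)],
)

# The company changes the number format: only the one parent row is updated.
conn.execute(
    "UPDATE employees SET employee_number = 'EMP-000042' WHERE employee_pk = ?",
    (employee_pk,),
)

numbers_seen = [
    row[0]
    for row in conn.execute(
        "SELECT e.employee_number FROM timesheets AS t"
        " JOIN employees AS e ON e.employee_pk = t.employee_pk"
    )
]
print(numbers_seen)
```

Note that employee_number is still declared UNIQUE, so the surrogate key does not replace the uniqueness rule on the business identifier.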
#14
Perhaps I will rephrase my question.

What are the dangers of using an autonumber field as the code for code values? E.g. can the autonumber field get set to a different value if I have to re-load the table? If it can be guaranteed to be static then I have no problem with using it as a primary key, e.g. for an employee ID, but if it's not, there seem to be some dangers in using it as such.

Certainly a person's name can never be a primary key (too many John Smiths out there), but a key on surname is useful for an ordered lookup.

Bulk updates are not a good argument against using a field (attribute) as a key, as they should be performed in non-prime time to minimise impact.

Normalisation is always the goal, and there will always be some fields (attributes) in the table that can uniquely define a row.

--
Denis
#15
I'm not sure what you mean by "re-load" the table.
-Amy
#16
Denis wrote:
What are the dangers of using an autonumber field as the code for code values? E.g. can the autonumber field get set to a different value if I have to re-load the table?

Amy Blankenship wrote:
I'm not sure what you mean by "re-load" the table.

If the OP meant this ...

    CREATE TABLE Employees (
       employee_ID COUNTER,
       last_name VARCHAR(35) NOT NULL,
       first_name VARCHAR(35) NOT NULL
    );

    INSERT INTO Employees (last_name, first_name)
       VALUES ('Smith', 'John');
    -- John Smith gets employee_ID = 1

    DELETE FROM Employees;

    INSERT INTO Employees (last_name, first_name)
       VALUES ('Smith', 'John');
    -- The same John Smith gets employee_ID = 2

... then they are correct: an autonumber can never be a true key, because the same entity gets a different key value depending on when it was entered into the system (relative order of INSERT).
#17
BruceM wrote:
What would you do to guarantee uniqueness in a Contacts table or some such involving names and addresses

For a Contacts table, last_name, first_name and postal_address make a fine natural key (assuming you can uniquely identify addresses <g>). The chances that someone with the same name living at the same address *is* the same person are very high. If they are different, then the chances of them being related, and hence being in contact with the intended person themselves, are high again.

Adding an autonumber to this Contacts table is not going to help you resolve this situation. You'd have to tell them they are ContactID=1, and every time you contacted them you'd have to check their ContactID to ensure you weren't addressing their eponymous grandfather... unless they'd divulged their ContactID. Anyhow, in doing so you'd have to 'expose' the autonumber value, and even the regulars who advocate autonumbers will tell you this is taboo.

Keys are all about ... what's the word here? ... trust, security, etc. For a Contacts table, name and address are good enough because the consequences of getting the wrong person aren't all that bad (hey, maybe the granddad will buy your product <g>). Higher levels of trust/security require different information to be stored/issued: PIN numbers, favourite question and answer, mother's maiden name, 'An email has been sent...reply or follow the link...', a personal appearance plus ID, photo ID, fingerprints, retina scan, etc. Autonumber does not help identify an entity in reality (in the data model); it can only be used internally (in the database).

BruceM wrote:
in light of the fact that names and addresses are subject to change?

Who says a key can't change? What do you think ON UPDATE CASCADE is for?
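The closing remark about ON UPDATE CASCADE can be illustrated with a small sketch. It assumes Python's built-in sqlite3 module as a stand-in for Jet (Jet's syntax and the Access relationship window differ), and the calls table and its note column are hypothetical. It shows a three-column natural key on contacts and a child table whose copy of that key follows along when the contact's address changes.

```python
import sqlite3

# SQLite stands in for Jet; foreign keys must be switched on per connection.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript(
    """
    CREATE TABLE contacts (
        last_name      TEXT NOT NULL,
        first_name     TEXT NOT NULL,
        postal_address TEXT NOT NULL,
        PRIMARY KEY (last_name, first_name, postal_address)  -- natural key from the post
    );
    CREATE TABLE calls (                                      -- hypothetical child table
        call_id        INTEGER PRIMARY KEY,
        last_name      TEXT NOT NULL,
        first_name     TEXT NOT NULL,
        postal_address TEXT NOT NULL,
        note           TEXT NOT NULL,
        FOREIGN KEY (last_name, first_name, postal_address)
            REFERENCES contacts (last_name, first_name, postal_address)
            ON UPDATE CASCADE
    );
    """
)

conn.execute("INSERT INTO contacts VALUES ('Doe', 'Jane', '1 High Street')")
conn.execute(
    "INSERT INTO calls (last_name, first_name, postal_address, note)"
    " VALUES ('Doe', 'Jane', '1 High Street', 'left message')"
)

# The contact moves house: update the key in the parent table only.
conn.execute(
    "UPDATE contacts SET postal_address = '2 Low Road'"
    " WHERE last_name = 'Doe' AND first_name = 'Jane'"
)

call_address = conn.execute("SELECT postal_address FROM calls").fetchone()[0]
print(call_address)
```

The trade-off raised elsewhere in the thread still applies: every child row stores the full key, so the cascade rewrites more data than a surrogate-key design would.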
#18
BruceM wrote:
Suppose I wanted to create a clustered index in an Access table. How would I do that?

That's it! You've hit on the golden question. You create a clustered index by using the PRIMARY KEY declaration. There is no other way to create a clustered index in Access/Jet.

If you want a non-nullable unique CONSTRAINT, you use NOT NULL UNIQUE. If you want a non-nullable unique clustered INDEX, you use PRIMARY KEY. CONSTRAINTs are all about data integrity (logical). INDEXes are all about performance (physical).

BruceM wrote:
The term does not appear in Access Help, and discussions of the subject tend to assume the reader knows what a clustered index is and how to create one.

There is info out there but it is easy to miss. One view is that there is no 'choice' for a table's clustered index: it's either PK order (PK exists) or date/time order (no PK exists). In SQL Server, for example, you can explicitly specify NONCLUSTERED. In Access/Jet, CLUSTERED is implicit, default and compulsory, i.e. it comes as standard with PK every time even if you don't want it. The point is, for an autonumber you *don't* want it.

Here are a couple of relevant articles you may have missed:

ACC2000: Defragment and Compact Database to Improve Performance
http://support.microsoft.com/default...b;en-us;209769

New Features in Microsoft Jet Version 3.0
http://support.microsoft.com/default...b;en-us;137039

BruceM wrote:
Even if one is created, what benefits will I notice?

What are the benefits? Improved performance, especially with queries that can take advantage of physically contiguous rows, e.g. GROUP BY or BETWEEN constructs. That is assuming you've chosen the PK appropriately. Conversely, if you've chosen unwisely, e.g. you've made your autonumber column the PK, you will take a performance hit.

Will you notice? There are too many factors to generalize; you must test. With a table of 100 rows, I doubt you would be able to *measure* any performance difference.

BruceM wrote:
Regarding John Doe, it may well be a name used by more than one person. How does this fit in with clustered indexes? I may need duplication in that field.

I suppose an autonumber could help you out here, i.e. you only need (last_name, first_name) for your clustered index but you need to satisfy the UNIQUE attribute that PRIMARY KEY requires. Note that the ordinal positions of the columns in the PRIMARY KEY declaration are significant:

    CREATE TABLE Blah (
       first_name VARCHAR(35) DEFAULT '{{NK}}' NOT NULL,
       last_name VARCHAR(35) DEFAULT '{{NK}}' NOT NULL,
       .... (other columns) ...
       uniquifier IDENTITY (1,1) NOT NULL,
       PRIMARY KEY (last_name, first_name, uniquifier)
    );

However, the autonumber is usually not required because there should be a natural key, i.e. attribute(s) which uniquely identify an entity. So use the existing key at the end of the PK declaration. Using an autonumber in place of (rather than in addition to) a natural key will lead to pain sooner or later.
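As a rough, runnable analogue of clustering on (last_name, first_name, uniquifier), the sketch below uses SQLite's WITHOUT ROWID tables, which store rows in primary-key order. This is an assumption-laden substitute: it is SQLite, not Jet, it says nothing about how Jet lays out pages or what Compact does, and the uniquifier values are supplied by hand because WITHOUT ROWID tables have no autonumber. It only illustrates the kind of BETWEEN range query that benefits when rows with neighbouring key values are stored together.

```python
import sqlite3

# SQLite WITHOUT ROWID table: rows are kept in PRIMARY KEY order.
# This is an analogy for the Jet behaviour discussed above, not a reproduction of it.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE people (
        last_name  TEXT NOT NULL,
        first_name TEXT NOT NULL,
        uniquifier INTEGER NOT NULL,   -- supplied by hand in this sketch
        PRIMARY KEY (last_name, first_name, uniquifier)
    ) WITHOUT ROWID
    """
)
conn.executemany(
    "INSERT INTO people (last_name, first_name, uniquifier) VALUES (?, ?, ?)",
    [
        ("Smith", "John", 1),
        ("Adams", "Ann", 2),
        ("Smith", "John", 3),   # a second John Smith, allowed by the uniquifier
        ("Jones", "Mary", 4),
        ("Young", "Zoe", 5),
    ],
)

# A range query on the leading key column reads one contiguous slice of the table.
in_range = conn.execute(
    "SELECT last_name, first_name, uniquifier FROM people"
    " WHERE last_name BETWEEN 'J' AND 'T'"
    " ORDER BY last_name, first_name, uniquifier"
).fetchall()
print(in_range)
```

Putting last_name first in the key is what makes the range on last_name contiguous; with the uniquifier first, the same rows would be scattered in insertion order.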
#19
Amy,

By re-loading I mean emptying or recreating the table and putting the data back again. Why would you do this? Perhaps to recover from corruption, etc. If you back up a table with an autonumber field, and there are gaps in the number sequence due to deletions, what happens to the autonumber field if you restore from this backup?

--
Denis

"Amy Blankenship" wrote:
I'm not sure what you mean by "re-load" the table.
-Amy
#20
Denis wrote:
By re-loading I mean empty/recreate the table and put the data back again. Why would you do this? Perhaps recovery from corruption etc... If you back up a table with an autonumber field and there are gaps in the number sequence due to deletions what happens to the autonumber field if you restore from this backup?

You can INSERT explicit values into an autonumber (COUNTER) column:

    CREATE TABLE Test (
       key_col COUNTER NOT NULL,
       data_col INTEGER NOT NULL
    );

    INSERT INTO Test (key_col, data_col) VALUES (2147483647, 1);

    INSERT INTO Test (key_col, data_col) VALUES (-2147483648, 2);

    INSERT INTO Test (data_col) VALUES (3);

So you can use an INSERT INTO...SELECT construct to reload your table using explicit values for the autonumber.
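The INSERT INTO...SELECT reload mentioned above might look like the following. As before, this is a sketch on Python's built-in sqlite3 module and not on Jet, where COUNTER columns and compacting may behave differently; the test_backup table is a hypothetical stand-in for whatever holds the backup. It shows that key values, including a gap left by a deletion, survive the round trip when they are inserted explicitly.

```python
import sqlite3

# SQLite stands in for Jet; test_backup is a made-up holding table.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE test (
        key_col  INTEGER PRIMARY KEY AUTOINCREMENT,  -- autonumber-style column
        data_col INTEGER NOT NULL
    );
    CREATE TABLE test_backup (
        key_col  INTEGER NOT NULL,
        data_col INTEGER NOT NULL
    );
    """
)

# Three rows get keys 1, 2, 3; deleting key 2 leaves a gap.
conn.executemany("INSERT INTO test (data_col) VALUES (?)", [(10,), (20,), (30,)])
conn.execute("DELETE FROM test WHERE key_col = 2")

# Back up, empty the table, then reload with explicit key values.
conn.execute("INSERT INTO test_backup (key_col, data_col) SELECT key_col, data_col FROM test")
conn.execute("DELETE FROM test")
conn.execute("INSERT INTO test (key_col, data_col) SELECT key_col, data_col FROM test_backup")

reloaded = conn.execute("SELECT key_col, data_col FROM test ORDER BY key_col").fetchall()
print(reloaded)
```

If the reload instead omitted key_col from the column list, the engine would hand out fresh values and any foreign keys pointing at the old ones would no longer match.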