If this is your first visit, be sure to check out the FAQ by clicking the link above. You may have to register before you can post: click the register link above to proceed. To start viewing messages, select the forum that you want to visit from the selection below. |
|
|
Thread Tools | Display Modes |
#101
|
|||
|
|||
Separate PK in Jxn Tbl?
I never seek to add columns where an attribute does not exist in the reality being modeled; sometimes I do end up adding something 'artificial' but only when there is a "good data modeling" reason for doing so.
May I add to that? When you design -- yes, actually design and not blindly throw on the first thing at hand -- an artificial key, you also have to create and document the validation and verification rules for it. I am a fan of check digits if this data element has to come in thru application code front ends. When we get SIMILAR TO in more SQL products, I will do more with regular expressions. |
#102
|
|||
|
|||
Separate PK in Jxn Tbl?
.. use of a DRI WITH ON UPDATE CASCADE .. it's another level of complexity that you must add to the design of your database; i.e., you must make sure that they are all there and no one is missing.
If the business rule was there, then you have to enforce it -- assuming that you want data integrity. The question is HOW to enforce it, not IF you are going to. DRI actions are declarative and deterministic; application code and triggers are procedural, proprietary and possible non-deterministic. Oh, did I mention that the optimizer can use DRI actions, but not application code and triggers? That only have to write the DRI once and not in every application, present and future? And that I have some graph theory to validate the chain of actions I set up, rather than doing code testing? All that would seem to make DRI both less complex and more practical than the other possible methods. Second, this DRI cannot be used with cyclic relationship with SQL-Server but with Oracle, you can. (From your example, I believe that you are working with Oracle). On SQL-Server, you must use triggers to implement such a feature when there is a cyclic relationship. Yes, SQL Server is weaker in some features. How does that invalidate the approach? And you can do some of this with really ugly updatable VIEWs that use the WITH CHECK OPTION. Triggers can look good after that. |
#103
|
|||
|
|||
Separate PK in Jxn Tbl?
« But many of them seem to write as if contents as determined by address ».
The content is not determined by the address and in fact, for those who are using surrogate keys, the exact value of an address inside the database has zero importance. To me, a table is a collection of jars, each jar has a serial number and the information that must be stored into the database - the information for which the database has been create - is put inside these jars. Each jar has its own serial number; however, the exact value of this serial number has not importance at all in regard to what is put inside the jar and if copy the information from this database to another database, I don't care to see the information to be put back in a jar with the exact same serial number. However, proponents of the use of natural keys insist to have a strong relationship between the serial number on the jar and what's in it. To me, such a relationship is pointless and I feel no necessity to use it. -- Sylvain Lafontaine, ing. MVP - Technologies Virtual-PC E-mail: sylvain aei ca (fill the blanks, no spam please) "David Cressey" wrote in message newsN%mj.4097$u_5.247@trndny09... "Marshall" wrote in message ... an attempt to make SQL data have an address Exactly. Literally hundreds of attempts to sell snake oil in c.d.t. can be reduced to precisely this statement. We've all said variations of the above, but I've never seen it put so succintly. As far as the MS Access newsgroups that this discussion is posted to, I can't speak to how well your summary extends to their mental model. But many of them seem to write as if contents as determined by address were the fundamental paradigm of data. |
#104
|
|||
|
|||
Separate PK in Jxn Tbl?
David W. Fenton wrote:
JOG wrote in m: I certainly don't think developers should excuse sloppy RDBMS design just because they are using access (and of course I'm sure many of the professionals here wouldn't dream of doing so, despite others laxness). What *are* you talking about? Any mistakes in schema design that you can make in Access, you can make in any other RDBMS. I would agree that there are many places that the wizards in Access and the sample databases encourage sub-optimal practices. But most professional developers aren't using either wizards or the sample databases as their models for developing their own applications. If your impression of Access comes from futzing with it for 10 minutes and from encountering kludged-together apps created by your company receptionist, then you just haven't a clue what Access offers, either as an application development platform or as a data store (using its native Jet engine). This is a schema question, and that is orthogonal to Access, because Access is an application development platform. If you use Jet for your data store, then Jet is relevant to the discussion. But you can use any data store that offers an ISAM or ADO or ODBC drivers, and that means all your schema issues are completely divorced from Access itself. That you can't seem to keep this distinction clear in your mentions of Access demonstrates pretty clearly that you are completely clueless about Access and really aren't in any position to be making disparaging comments about it. Nothing you have written contradicts JOG. In fact, you have agreed with him albeit vehemently and derogatively. |
#105
|
|||
|
|||
Separate PK in Jxn Tbl?
This remind me of the discussion twenty five years ago between relational
databases (like SQL-Server) and non-relational databases (like ISAM databases, dBase, etc.). It has always been right to say that for simple queries, non-relational databases are faster than relational databases. However, nowadays, non-relational databases have (practically) vanished because of their inherent slowness when the queries become more and more complicated. This is exactly the same situation with the possibility of accelerating a query by using a natural key: you are accelerating simple queries that are already light and fast but on the opposite side, you are slowing down complexe queries that are already big and slow. Not sure if going this way is really advantageous. -- Sylvain Lafontaine, ing. MVP - Technologies Virtual-PC E-mail: sylvain aei ca (fill the blanks, no spam please) "Brian Selzer" wrote in message ... wrote in message ... On Jan 27, 12:39 am, "Brian Selzer" wrote: "James A. Fortune" wrote in messagenews:% Access programmers use forms to interact with the data. If I follow Jamie's advice and constrain the data at both the table level and in code, then your points make more sense. Right now, they're just arguments for me not to constrain the data at the table level because the reasons you gave might make natural keys preferable in that situation :-). Well, that's just dumb. Checks in code can reduce database round-trips, and therefore can improve performance, but are not and cannot be a substitute for constraints on the tables. It is the constraints on the tables that keeps garbage out of the database. If the users only access the tables through forms, conforming to best practices in Access, how are they going to get garbage into the tables? Now if you're trying to keep Jamie and his Excel SQL out of your database, that's another story :-). There can be several forms that access the same table, so you would have to duplicate the code behind each form that accesses a table, or you can get garbage into the database. * Referencing an artificial key in a child table can complicates queries - and not just with a longer restrict clause, but with a whole extra join that may well have been unrequired if a natural key had been used. I don't agree with that point. The child table can contain the AutoNumber primary key from the main table as a foreign key if desired. I don't see how using the natural key fields requires less joins than that. Maybe an example would help me understand what you mean. An extra join may be needed if the natural key from the parent table is used in a restrict clause. If all you have is the artificial key from the parent table, then you have to join in order to access the natural key columns. With natural keys, the natural key values from the parent table also appear in the child table, so there isn't any need to join. Bottom line: joins of artificial keys are typically faster than joins of natural keys due to the size of the comparands, but with natural keys, fewer joins may be needed.. If you're planning on using a natural key column in the child table as part of a join then doesn't it make sense to include that field in the child table? Still waiting... A typical schema with artificial keys: Customer {CustomerKey, CustomerNo, ...} Key {CustomerKey}, Key {CustomerNo} Item {ItemKey, ItemNo, ...} Key {ItemKey}, Key {ItemNo} CI {CustomerItemKey, CustomerKey, ItemKey, CustomerItemNo} Key {CustomerItemKey}, Key {CustomerKey, ItemKey} CI[ItemKey] IN Item[ItemKey] CI[CustomerKey] IN Customer[CustomerKey] SOLine {SOLineKey, SOKey, SOLineNo, CustomerItemKey, Quantity, Price} Key {SOLineKey}, Key {SOKey, SOLineNo} SOLine[CustomerItemKey] IN CI[CustomerItemKey] A typical schema with natural keys Customer {CustomerNo, ...} Key {CustomerNo} Item {ItemNo, ...} Key {ItemNo} CI {CustomerNo, ItemNo, CustomerItemNo} KEY {CustomerNo, ItemNo} CI[CustomerNo] IN Customer[CustomerNo] CI[ItemNo] IN Item[ItemNo] SOLine {SO#, SOLineNo, CustomerNo, ItemNo, Quantity, Price} SOLine[CustomerNo, ItemNo] IN CI[CustomerNo, ItemNo] Now write a query that returns how many of item '12345' were sold to customer '4321' It should be obvious that with the natural keys, no joins are necessary--it's just a simple select from SOLine since all of the information is actually /in/ SOLine; whereas, with the artifical keys, several joins are required because in order to query by item number and customer number, SOLine must be joined to CI which must then be joined to Customer and Item. |
#106
|
|||
|
|||
Separate PK in Jxn Tbl?
David W. Fenton wrote:
wrote in m: If the users only access the tables through forms, conforming to best practices in Access, how are they going to get garbage into the tables? What if there's more than one application built on top of the database? I, too, agree that one should put as much of the data logic in the back end as possible. However, that doesn't mean I use natural keys very often. I'm definitely opposed to compound keys for any table whose PK will be a foreign key in another table. It causes myriad problems of all sorts (been there, done that), and despite its being theoretically correct, just doesn't work well in practice. Reference is one issue that increases the tradeoff importance of simplicity relative to the other design criteria. Just consider one scenario: You need to build criteria for a query-by-form interface. That means that to query on the PK of a table with a compound PK, you end up needing to have multiple fields in your WHERE clause. And if you're querying multiple records in the table with the compound PK, you'll need a complex nested OR in your WHERE clause. I know perfectly well that theoretically speaking you're not supposed to let your application drive the design of your schema, but this is a case where common sense tells me that following theory leads to enormously difficult application logic problems. I have yet to see any evidence from you to suggest you know the first thing about theory. Perhaps you should strive to learn a little more about it before blathering on about it. Natural keys are great for tables with a single-column natural PK. A natural key is neither more nor less than a familiar surrogate. Otherwise, surrogate keys make building an application substantially easier. I disagree. Since all keys are fundamentally the same thing, surrogacy is irrelevant. The design criteria for keys a uniqueness, irreducibility, simplicity, stability and familiarity (in no particular order.) And, BTW, I would, of course, advocate that any natural key that is not used as the PK should naturally have a unique index on it. You confuse physical and logical issues. One should declare all logical constraints regardless of the indexes used. And any natural key that can't have a unique index (because some fields need to be Null) was never a candidate for PK in the first place, and would have to have had uniqueness enforced in the application in some fashion anyway. One would have to be an idiot to design anything allowing NULL in the first place. |
#107
|
|||
|
|||
Separate PK in Jxn Tbl?
David Cressey wrote:
"Brian Selzer" wrote in message . .. Well, that's just dumb. Checks in code can reduce database round-trips, and therefore can improve performance, but are not and cannot be a substitute for constraints on the tables. It is the constraints on the tables that keeps garbage out of the database. The idea of keeping garbage out of the database takes on an entirely different meaning if you are dealing with hundreds of programs written in COBOL, Java, or anything in between accessing a single Oracle database on the one hand. On the other hand, if you are a developer creating a self contained MS Access database cum application (tables, queries, forms, reports, modules, etc.) all in one file, the same issues arise, but they are resolved quite differently. I'm not saying either one is "right" or "wrong". I'm just suggesting why an objection that makes perfect sense to you and me might be lost on the MS Access community. I am only speaking for myself. I may be the only Access programmer on the planet who validates input the way I do in code. I am not unaware of the desirability of enforcing constraints at the table level. My reluctance to depend on error trapping is shared by very few, if any, Access developers. Those who judge my unorthodox style choices as ignorant or incorrect presume much. For example, in "Joe Celko's SQL PROGRAMMING STYLE: Q: Couldn't a natural compound key become very long? A1: So what? This is the 21st century, and we have much better computers than we did in the 1950s when key size was a real physical issue. What is funny to me is the number of idiots who replace a natural two- or three- integer compound key with a huge GUID, which no human being or other system can possibly understand, because they think it will be faster and easy to program. .... The auto-numbering features are a holdover from the early SQLs, which were based on contiguous storage file systems. The data was kept in physically contiguous disk pages, in physically contiguous rows, made up of physically contiguous columns. In short, just like a deck of punchcards or a magnetic tape. Most programmers still carry that mental model, too. The most pathetic part is that many here even parrot his (incorrect) reasoning about my reasoning. I submit that they, not I, are succumbing to subtle psychological effects. James A. Fortune |
#108
|
|||
|
|||
Separate PK in Jxn Tbl?
"Sylvain Lafontaine" sylvain aei ca (fill the blanks, no spam please) wrote in message ... This remind me of the discussion twenty five years ago between relational databases (like SQL-Server) and non-relational databases (like ISAM databases, dBase, etc.). It has always been right to say that for simple queries, non-relational databases are faster than relational databases. However, nowadays, non-relational databases have (practically) vanished because of their inherent slowness when the queries become more and more complicated. I don't believe the above is correct. It's still the case that a graph based DBMS like IMS can otperform an SQL based DBMS for queries that were anticipated at the time the graph was designed. This is not a function of the complexity of the query. It's a function of the relationship between the path the query must take through the data and the paths that are made blindingly fast by the graph set up at stata store time. Essentially, an index gives and SQL engine exactly what it needs to locate table rows: pointers to table rows. Where the SQL DBMS (and the databases built using it) are superior to a graph based DBMS (and the databases built using it) is flexibility. When an unanticipated query occurs in a graph based system, the DBMS is reduced to doing a brute force search (if it can even do that much). Either that or the database designer must redesign the database, unload and reload all the data, and revise any application programs that were dependent on the old structure. That's usually prohibitively expensive. By contrast the SQL DBMS can generally build an in memory index when needed, or a new index can be built by the database designer without unloading any data or requiring any revisions to programs. This is exactly the same situation with the possibility of accelerating a query by using a natural key: you are accelerating simple queries that are already light and fast but on the opposite side, you are slowing down complexe queries that are already big and slow. Not sure if going this way is really advantageous. I disagree. -- Sylvain Lafontaine, ing. MVP - Technologies Virtual-PC E-mail: sylvain aei ca (fill the blanks, no spam please) "Brian Selzer" wrote in message ... wrote in message ... On Jan 27, 12:39 am, "Brian Selzer" wrote: "James A. Fortune" wrote in messagenews:% Access programmers use forms to interact with the data. If I follow Jamie's advice and constrain the data at both the table level and in code, then your points make more sense. Right now, they're just arguments for me not to constrain the data at the table level because the reasons you gave might make natural keys preferable in that situation :-). Well, that's just dumb. Checks in code can reduce database round-trips, and therefore can improve performance, but are not and cannot be a substitute for constraints on the tables. It is the constraints on the tables that keeps garbage out of the database. If the users only access the tables through forms, conforming to best practices in Access, how are they going to get garbage into the tables? Now if you're trying to keep Jamie and his Excel SQL out of your database, that's another story :-). There can be several forms that access the same table, so you would have to duplicate the code behind each form that accesses a table, or you can get garbage into the database. * Referencing an artificial key in a child table can complicates queries - and not just with a longer restrict clause, but with a whole extra join that may well have been unrequired if a natural key had been used. I don't agree with that point. The child table can contain the AutoNumber primary key from the main table as a foreign key if desired. I don't see how using the natural key fields requires less joins than that. Maybe an example would help me understand what you mean. An extra join may be needed if the natural key from the parent table is used in a restrict clause. If all you have is the artificial key from the parent table, then you have to join in order to access the natural key columns. With natural keys, the natural key values from the parent table also appear in the child table, so there isn't any need to join. Bottom line: joins of artificial keys are typically faster than joins of natural keys due to the size of the comparands, but with natural keys, fewer joins may be needed.. If you're planning on using a natural key column in the child table as part of a join then doesn't it make sense to include that field in the child table? Still waiting... A typical schema with artificial keys: Customer {CustomerKey, CustomerNo, ...} Key {CustomerKey}, Key {CustomerNo} Item {ItemKey, ItemNo, ...} Key {ItemKey}, Key {ItemNo} CI {CustomerItemKey, CustomerKey, ItemKey, CustomerItemNo} Key {CustomerItemKey}, Key {CustomerKey, ItemKey} CI[ItemKey] IN Item[ItemKey] CI[CustomerKey] IN Customer[CustomerKey] SOLine {SOLineKey, SOKey, SOLineNo, CustomerItemKey, Quantity, Price} Key {SOLineKey}, Key {SOKey, SOLineNo} SOLine[CustomerItemKey] IN CI[CustomerItemKey] A typical schema with natural keys Customer {CustomerNo, ...} Key {CustomerNo} Item {ItemNo, ...} Key {ItemNo} CI {CustomerNo, ItemNo, CustomerItemNo} KEY {CustomerNo, ItemNo} CI[CustomerNo] IN Customer[CustomerNo] CI[ItemNo] IN Item[ItemNo] SOLine {SO#, SOLineNo, CustomerNo, ItemNo, Quantity, Price} SOLine[CustomerNo, ItemNo] IN CI[CustomerNo, ItemNo] Now write a query that returns how many of item '12345' were sold to customer '4321' It should be obvious that with the natural keys, no joins are necessary--it's just a simple select from SOLine since all of the information is actually /in/ SOLine; whereas, with the artifical keys, several joins are required because in order to query by item number and customer number, SOLine must be joined to CI which must then be joined to Customer and Item. |
#109
|
|||
|
|||
Separate PK in Jxn Tbl?
"-CELKO-" wrote in message ... Whenever I've created junction tables [sic] in the past, I always made the PK of the junction table the combined pks from the two other tables.. I wish I knew where people got the term "junction table" when they mean a table that models a relationship . I guess if you are still thinking in terms PK-FK being pointer chains and structure and not relational, then you would borrow old terminology from Network Databases. And bring that mindset with you, too. It probably came from the term "junction record" in CODASYL databases, where a record can be a member of two sets, in order to enable a many-to-many connection between the owners of the record in the two sets. Much of the early (Bowdlerized) description of relational databases was written with reference back to the (at the time) more familiar CODASYL databases. |
#110
|
|||
|
|||
Separate PK in Jxn Tbl?
David Cressey wrote:
"Sylvain Lafontaine" sylvain aei ca (fill the blanks, no spam please) wrote in message ... This remind me of the discussion twenty five years ago between relational databases (like SQL-Server) and non-relational databases (like ISAM databases, dBase, etc.). It has always been right to say that for simple queries, non-relational databases are faster than relational databases. However, nowadays, non-relational databases have (practically) vanished because of their inherent slowness when the queries become more and more complicated. I don't believe the above is correct. It's still the case that a graph based DBMS like IMS can otperform an SQL based DBMS for queries that were anticipated at the time the graph was designed. This is not a function of the complexity of the query. I, of course, disagree. A graph based dbms has no performance advantage over a relational dbms because a relational dbms may use exactly the same physical structures. [snip] |
Thread Tools | |
Display Modes | |
|
|