If this is your first visit, be sure to check out the FAQ by clicking the link above. You may have to register before you can post: click the register link above to proceed. To start viewing messages, select the forum that you want to visit from the selection below. |
|
|
Thread Tools | Display Modes |
#1
|
|||
|
|||
Circular relationships and limiting allowed values
Hi All,
I have a surprisingly simple, but complicated database requirement in that the database requires only 5 tables, but includes a circular reference by necessity. The problem I'm having is that I am unsure how to create the queries on which to base my input forms so that the correct data is displayed. Essentially I have a workflow process that is described in the following way: A dataset is released which is deployed to several regions, and each deployment involves several tasks. Not every region receives every dataset, but every dataset is built and deployed to at least one region. Several physical regions represented in the Region table fldRegionID - pk fldRegionName Several sets of data represented in the Dataset table fldDataID - pk fldDataName fldCycle (this represents the cycle with which the dataset is updated) Each dataset has several tasks that need to be completed before they are deployed, some of which are also dependant on the region to which they are deployed. This is represented by two tables, one (build) which contains those tasks that are completed for all updates to the dataset, and the second (deployment) which contains those tasks that are required for each regions deployment. These are represented as follows: - Build table fldVersionID - pk fldDatasetID – pk + fk fldReleaseDate fldBuildTask1 fldBuildTask2 |
#2
|
|||
|
|||
Circular relationships and limiting allowed values
A minor side point (sorry, my brain isn't wrapping around your larger
picture yet)... Your "task" table uses multiple fields (1 ... n) to ?contain/?document completion of tasks. This design works for spreadsheets, but not for relational databases. If you need to add (or subtract) a task, you'd have to modify your table structure, any queries based on it, any forms/reports based on it, any code/macros based on it ... Instead, use a single column for your Task-related item, and one/two more to relate it properly to whatever combination of dataset and region "needs" that task. One row per valid task-dataset-region combination. I'm not clear on what you are considering "circular relationships" -- could you clarify? I'm not clear what you mean when you use: pk + fk -- are you saying the field is a (part of the) primary key AND is found in another table's primary key (i.e., making it a foreign key in THIS table)? -- Regards Jeff Boyce Office/Access MVP "Rararachel" wrote in message ... Hi All, I have a surprisingly simple, but complicated database requirement in that the database requires only 5 tables, but includes a circular reference by necessity. The problem I'm having is that I am unsure how to create the queries on which to base my input forms so that the correct data is displayed. Essentially I have a workflow process that is described in the following way: A dataset is released which is deployed to several regions, and each deployment involves several tasks. Not every region receives every dataset, but every dataset is built and deployed to at least one region. Several physical regions represented in the Region table fldRegionID - pk fldRegionName Several sets of data represented in the Dataset table fldDataID - pk fldDataName fldCycle (this represents the cycle with which the dataset is updated) Each dataset has several tasks that need to be completed before they are deployed, some of which are also dependant on the region to which they are deployed. This is represented by two tables, one (build) which contains those tasks that are completed for all updates to the dataset, and the second (deployment) which contains those tasks that are required for each regions deployment. These are represented as follows: - Build table fldVersionID - pk fldDatasetID – pk + fk fldReleaseDate fldBuildTask1 fldBuildTask2 . . . fldBuildTaskn - Deployment table fldVersionID – pk + fk fldDatasetID – pk + fk fldRegionID – pk + fk fldDeployTask1 fldDeployTask2 . . . fldDeployTaskn Because each region may or may not receive an update, and the relationship between datasets and regions is m-m, a fifth, RegionDataset table is required: fldDatasetID – pk + fk fldRegionID – pk + fk This table’s values are essentially static. PHEW!! I have been through this structure several times and tried to re-jig things to avoid the circular relationships created, but none of the options that I have tried suits the requirements. Where I have a problem is when I try and pre-populate items in the deployment table. I don’t know how to set up my queries so that only valid deployment records may be added, given the values in both the Build table and the RegionDataset table. EG: RegionA receives dataset updates for Dataset1 and Dataset4 RegionB receives dataset updates for Dataset1, Dataset2 and Dataset3 This is represented in the RegionDataset table by the following: RegionA-Dataset1 RegionA-Dataset4 RegionB-Dataset1 RegionB-Dataset2 RegionB-Dataset3 Dataset1 has a version update v111, so a new build record is created with a compound primary key: Dataset1-v111 Once this entry is made, I’d like my database to be able to automatically add the only two valid entries to the Deployment table: RegionA-Dataset1-v111 RegionB-Dataset1-v111 But my queries must be incorrect, because this does not work. Thanks to anyone who has read this far! If anyone has done something similar before or can point me in the right direction it would be greatly appreciated. I have searched I-don’t-know-how-many helpfiles, newsgroups and websites, but without any luck. |
#3
|
|||
|
|||
Circular relationships and limiting allowed values
Well, it worked OK for me. Here's what I did:
For simplicity, I ignored the primary keys you defined and gave each Table its own (new) primary key, of random Autonumber type. But because such keys are usually meaningless to human beings, I also defined Lookup Queries to translate them to meaningful form. For example, my version of your [Dataset] Table looks like this: [Dataset] Table Datasheet View: fldDatasetID fldDataName fldCycle ------------ ----------- -------- 1400563518 Dataset4 0 1570848593 Dataset3 0 1632673424 Dataset2 0 1903683883 Dataset1 0 So, for foreign-key links to it, I defined the following Query: [QL_Dataset] SQL: SELECT Dataset.fldDatasetID, Dataset.fldDataName FROM Dataset ORDER BY Dataset.fldDataName; [QL_Dataset] Query Datasheet View: fldDatasetID fldDataName ------------ ----------- 1903683883 Dataset1 1632673424 Dataset2 1570848593 Dataset3 1400563518 Dataset4 It looks a lot like the first 2 fields in the Table, but you might prefer some other definition. (Just change the 2nd field in [QL_Dataset], and your datasheet views will magically change, too.) Make the 2nd field be concise and meaningful, and try to keep it unique within any data you're likely to encounter. Now we can use this in the [Build] Table, which (without lookups) looks like this: [Build] Table Datasheet View: fldBuildID fldDatasetID fldVer fldRelease fldBuild sionID Date Task1 ---------- ------------ ------ ---------- -------- 1678458326 1903683883 v111 12/9/2005 In Table Design View, select [fldDatasetID], choose the Lookup tab, and set the properties thus: Display Control = List Box Row Source = QL_Dataset Column Count = 2 Column Widths = 0;1 Then the Table will display like this (same Table, but with lookups enabled): [Build] Table Datasheet View: fldBuildID fldDatasetID fldVer fldRelease fldBuild sionID Date Task1 ---------- ------------ ------ ---------- -------- 1678458326 Dataset1 v111 12/9/2005 I also defined [QL_Region], for the [fldRegionID] foreign keys. Change the remaining foreign-key references in your Tables similarly to the one in [Build]. Since [fldVersionID] was just a text field, not a linking key, I didn't define a Lookup Query for it. (But in my own databases, I usually avoid ending field names with "ID" unless they are links.) The rest of the datasheet views in this message use lookup values; just remember that the data actually stored in the Tables are numbers matching the primary keys, and not what you see here. [Region] Table Datasheet View: fldRegionID fldRegionName ----------- ------------- -213821540 RegionB 836524519 RegionA [RegionDataset] Table Datasheet View: fldRegionDatasetID fldRegionID fldDatasetID ------------------ ----------- ------------ 421066538 RegionA Dataset4 798222153 RegionB Dataset3 1341337187 RegionB Dataset1 1379044973 RegionA Dataset1 1932269928 RegionB Dataset2 Then I defined a Query to display the new version number. [Q_NewVersions] SQL: SELECT Region.fldRegionName, Dataset.fldDataName, Build.fldVersionID FROM Region INNER JOIN ((Dataset INNER JOIN Build ON Dataset.fldDatasetID = Build.fldDatasetID) INNER JOIN RegionDataset ON Dataset.fldDatasetID = RegionDataset.fldDatasetID) ON Region.fldRegionID = RegionDataset.fldRegionID ORDER BY Region.fldRegionName, Dataset.fldDataName; [Q_NewVersions] Query Datasheet View: fldRegionName fldDataName fldVersionID ------------- ----------- ------------ RegionA Dataset1 v111 RegionB Dataset1 v111 This looked good, so I converted it to an Append Query. [Q_UpdateBuildList] SQL: INSERT INTO Deployment ( fldRegionID, fldDatasetID, fldVersionID, fldDeployTask1 ) SELECT Region.fldRegionID, Dataset.fldDatasetID, Build.fldVersionID, "New deployment" AS [Default] FROM Region INNER JOIN ((Dataset INNER JOIN Build ON Dataset.fldDatasetID = Build.fldDatasetID) INNER JOIN RegionDataset ON Dataset.fldDatasetID = RegionDataset.fldDatasetID) ON Region.fldRegionID = RegionDataset.fldRegionID ORDER BY Region.fldRegionID, Dataset.fldDatasetID; Since this is an Action Query, I won't show you its Datasheet View, but here's what it did to the previously empty [Deployment] Table: [Deployment] Table Datasheet View: fldDeploy fldVer fldData fldRegionID fldDeployTask1 mentID sionID setID ---------- ------ -------- ----------- -------------- 138997748 v111 Dataset1 RegionA New deployment 1997021855 v111 Dataset1 RegionB New deployment .... and this is what I think you said you wanted. Incidentally (and I agree with Jeff Boyce on this), all those [fldDeployTask n] fields give me the impression of containing the same sorts of data. (They certainly have similar enough names!) I suggest that you split them all out into another Table, in which each record contains (at least) the following fields: - primary key to identify this record - which task it is (1..n) - value copied from the [fldDeployTask n] field - foreign key to [Deployment] or [Build], wherever the datum came from If you wish, you can combine the fields from [Deployment] and [Build] into one Table, but then you'd need a way to distinguish them, either via an additional field, or by using different task number series (100-199 for [Deployment], 200-299 for [Build], for example). If you do this, it will create an extra Table or two, but you'll save yourself lots of headaches in analyzing & summarizing your data. -- Vincent Johns Please feel free to quote anything I say here. Rararachel wrote: Hi All, I have a surprisingly simple, but complicated database requirement in that the database requires only 5 tables, but includes a circular reference by necessity. The problem I'm having is that I am unsure how to create the queries on which to base my input forms so that the correct data is displayed. Essentially I have a workflow process that is described in the following way: A dataset is released which is deployed to several regions, and each deployment involves several tasks. Not every region receives every dataset, but every dataset is built and deployed to at least one region. Several physical regions represented in the Region table fldRegionID - pk fldRegionName Several sets of data represented in the Dataset table fldDataID - pk fldDataName fldCycle (this represents the cycle with which the dataset is updated) Each dataset has several tasks that need to be completed before they are deployed, some of which are also dependant on the region to which they are deployed. This is represented by two tables, one (build) which contains those tasks that are completed for all updates to the dataset, and the second (deployment) which contains those tasks that are required for each regions deployment. These are represented as follows: - Build table fldVersionID - pk fldDatasetID – pk + fk fldReleaseDate fldBuildTask1 fldBuildTask2 . . . fldBuildTaskn - Deployment table fldVersionID – pk + fk fldDatasetID – pk + fk fldRegionID – pk + fk fldDeployTask1 fldDeployTask2 . . . fldDeployTaskn Because each region may or may not receive an update, and the relationship between datasets and regions is m-m, a fifth, RegionDataset table is required: fldDatasetID – pk + fk fldRegionID – pk + fk This table’s values are essentially static. PHEW!! I have been through this structure several times and tried to re-jig things to avoid the circular relationships created, but none of the options that I have tried suits the requirements. Where I have a problem is when I try and pre-populate items in the deployment table. I don’t know how to set up my queries so that only valid deployment records may be added, given the values in both the Build table and the RegionDataset table. EG: RegionA receives dataset updates for Dataset1 and Dataset4 RegionB receives dataset updates for Dataset1, Dataset2 and Dataset3 This is represented in the RegionDataset table by the following: RegionA-Dataset1 RegionA-Dataset4 RegionB-Dataset1 RegionB-Dataset2 RegionB-Dataset3 Dataset1 has a version update v111, so a new build record is created with a compound primary key: Dataset1-v111 Once this entry is made, I’d like my database to be able to automatically add the only two valid entries to the Deployment table: RegionA-Dataset1-v111 RegionB-Dataset1-v111 But my queries must be incorrect, because this does not work. Thanks to anyone who has read this far! If anyone has done something similar before or can point me in the right direction it would be greatly appreciated. I have searched I-don’t-know-how-many helpfiles, newsgroups and websites, but without any luck. |
#4
|
|||
|
|||
Circular relationships and limiting allowed values
"=?Utf-8?B?UmFyYXJhY2hlbA==?="
wrote in : Essentially I have a workflow process that is described in the following way: A dataset is released which is deployed to several regions, and each deployment involves several tasks. Not every region receives every dataset, but every dataset is built and deployed to at least one region. No problem he it's a bog standard many to many relationship: Datasets -(isPartOf)- Deployments -(isSentTo)- Regions Each dataset has several tasks that need to be completed before they are deployed, You already said that: see above Tasks -(toBeCompletedFor)- Deployments Each dataset has several tasks that need to be completed before they are deployed, some of which are also dependant on the region to which they are deployed. The kicker here is the "...some of which...". What is special about dependent Tasks compared to independent Tasks? How are independent Tasks allocated - to all Regions that the Dataset is going to, or none? Are they the same thing, or do they need separate tables? One possibility would be DependentTasks -(toBeCompletedFor)- Deployments IndependentTasks -(belongTo)- Datasets I have been through this structure several times and tried to re-jig things to avoid the circular relationships created, but none of the options that I have tried suits the requirements. Who on earth cares about circular relationships? Who told you that they are bad things? There are loads of examples where two entities have more than one relationship between them. Pah. In your case, I think you haven't done sufficient analysis: at least, you have not presented it here. What exactly is the connection between Deploying To A Region and Having A Task Related To That Deployment? In other words, do some Regions have a Deployment but no Task to complete on it? How are non-tasked Deployments different from Tasked Deployments? In other words, would this work: Deployments -(require)- Tasks -(by)- Regions or would you lose some deployments-to-region because there are no tasks for that region to complete? I don't know how to set up my queries so that only valid deployment records may be added, given the values in both the Build table and the RegionDataset table. You manage integrity using Constraints (i.e. ValidationRules and Relationships) not queries! RegionA receives dataset updates for Dataset1 and Dataset4 RegionB receives dataset updates for Dataset1, Dataset2 and Dataset3 This is represented in the RegionDataset table by the following: Okay, it's what I called the Deployments table RegionA,Dataset1 RegionA,Dataset4 RegionB,Dataset1 RegionB,Dataset2 RegionB,Dataset3 Dataset1 has a version update v111, so a new build record is created with a compound primary key: You never said anything about VersionUpdates -- what are they? Do they have Tasks and Deployments too? Dataset1-v111 Once this entry is made, I would like my database to be able to automatically add the only two valid entries to the Deployment table: No you don't. Any "automatic adding" practically always means a Major Design Problem. But we already know that g |
#5
|
|||
|
|||
Circular relationships and limiting allowed values
Tim Ferguson wrote:
"=?Utf-8?B?UmFyYXJhY2hlbA==?=" wrote in : [...] Dataset1-v111 Once this entry is made, I would like my database to be able to automatically add the only two valid entries to the Deployment table: No you don't. Any "automatic adding" practically always means a Major Design Problem. But we already know that g Tim makes a good point here (I'm sorry I missed it; maybe I was thinking that someone had asked you to produce a Table). Since you already have the information available that you claim you need to stuff into your [Deployment] Table, you probably should be using a [Deployment] Query instead, one that returns these same results but doesn't clutter your database with data that might go out of date behind your back. The [Q_NewVersions] Query in my earlier message, or something like it, will do that; just give it a new name if you need to. -- Vincent Johns Please feel free to quote anything I say here. |
#6
|
|||
|
|||
Circular relationships and limiting allowed values
Hiya,
Firstly, thanks so much for taking the time to a) read and b) reply to my post -particularly a) since it probably took the better part of half an hour to read that epic! In response to your reply, I think I need to clarify a couple of things that I didn't make particularly clear... Your "task" table uses multiple fields (1 ... n) to ?contain/?document completion of tasks. This design works for spreadsheets, but not for relational databases. If you need to add (or subtract) a task, you'd have to modify your table structure, any queries based on it, any forms/reports based on it, any code/macros based on it ... I left out info regarding each task for simplicity, but realise that this made it more confusing. Each of the fields that I have labelled tasks stores a date representing when a particular action was undertaken. When a dataset is released, it needs to be packaged and replicated (amongst other things), and this has to be performed for each build (ie: each version update). In addition, when the updated datasets are deployed, there are tasks that need to be completed for each region that receives the updates (eg: create the script for rolling out the update, confirm that the users have successfully received the update - again, the stored values are dates). I think this is where I made the least sense. Every version release for a dataset will always have every task completed that belongs in the build table. Any region that receives an update will always have every deployment task completed. These tasks are extremely unlikely to change, and will involve a process rethink anyway if they do. I'm not sure that pulling the tasks out of each of the Build and Deployment tables is necessary, each task will have a one-one relationship with the records in these tables anyway, and all are required (at some stage) - they are all just attributes of the records within these tables. I'm not clear on what you are considering "circular relationships" -- could you clarify? What I meant (and apologies once again for confusion, I just didn't know how else to describe this one), is that the dataset table had relationships with both the Build and RegionDataset tables and the deployment table also had relationships with these two tables, which looks like a circle in the relationship viewer in Access) I'm not clear what you mean when you use: pk + fk -- are you saying the field is a (part of the) primary key AND is found in another table's primary key (i.e., making it a foreign key in THIS table)? That's exactly right I hope this makes sense. Thanks again for the feedback. Cheers! |
#7
|
|||
|
|||
Circular relationships and limiting allowed values
"Vincent Johns" wrote: .... and this is what I think you said you wanted. Thanks Vincent, I will give this a go now. I thought I had to build this into the table design rather than just entering the records when I added each build. Incidentally (and I agree with Jeff Boyce on this), all those [fldDeployTask n] fields give me the impression of containing the same sorts of data. I realise I went the right way about confusing everyone with the description of this, and apologies for that. I've explained it again in my reply to Jeff, but I'll add it here for easy access: "Each of the fields that I have labelled 'tasks' stores a date representing when a particular action was undertaken. When a dataset is released, it needs to be packaged and replicated (amongst other things), and this has to be performed for each build (ie: each version update). In addition, when the updated datasets are deployed, there are tasks that need to be completed for each region that receives the updates (eg: create the script for rolling out the update, confirm that the users have successfully received the update - again, the stored values are dates). I think this is where I made the least sense. Every version release for a dataset will always have every task completed that belongs in the build table. Any region that receives an update will always have every deployment task completed. These tasks are extremely unlikely to change, and will involve a process rethink anyway if they do. I'm not sure that pulling the tasks out of each of the Build and Deployment tables is necessary, each task will have a one-one relationship with the records in these tables anyway, and all are required (at some stage) - they are all just attributes of the records within these tables." I'm still sure this is valid, but I'm willing to accept a slap with a wet dishcloth if you think I'm just not getting it! Thanks again for your feedback, I'll give your suggestions a go, and send an update on my progress... Cheers! |
#8
|
|||
|
|||
Circular relationships and limiting allowed values
Hiya Tim, Thanks for your reply. I appreciate this a was somewhat confusing post, but I've tried to clarify this in another post which I'll add in below... The kicker here is the "...some of which...". What is special about dependent Tasks compared to independent Tasks? How are independent Tasks allocated - to all Regions that the Dataset is going to, or none? Are they the same thing, or do they need separate tables? I re-read my post after your comments, and I was particularly bad with this section. Some of the tasks do vary a bit between the regions they are deployed to, but not really enough to be reflected in the database - this was an especially unhelpful piece of info - apologies. As per previous posts to Jeff and Vincent: "Each of the fields that I have labelled 'tasks' stores a date representing when a particular action was undertaken. When a dataset is released, it needs to be packaged and replicated (amongst other things), and this has to be performed for each build (ie: each version update). In addition, when the updated datasets are deployed, there are tasks that need to be completed for each region that receives the updates (eg: create the script for rolling out the update, confirm that the users have successfully received the update - again, the stored values are dates). I think this is where I made the least sense. Every version release for a dataset will always have every task completed that belongs in the build table. Any region that receives an update will always have every deployment task completed. These tasks are extremely unlikely to change, and will involve a process rethink anyway if they do. I'm not sure that pulling the tasks out of each of the Build and Deployment tables is necessary, each task will have a one-one relationship with the records in these tables anyway, and all are required (at some stage) - they are all just attributes of the records within these tables." You never said anything about VersionUpdates -- what are they? Do they have Tasks and Deployments too? Version updates are builds essentially, well that is to say that every version update has a build (but also several deployments) No you don't. Any "automatic adding" practically always means a Major Design Problem. But we already know that g Have taken this as read, and will go with Vincent's append query suggestion on this one I think. Thanks heaps for the time you've taken to reply. I'll send an update...soon! Cheers |
#9
|
|||
|
|||
Circular relationships and limiting allowed values
Rararachel wrote:
Hiya Tim, [...] I'm not sure that pulling the tasks out of each of the Build and Deployment tables is necessary, each task will have a one-one relationship with the records in these tables anyway, and all are required (at some stage) - they are all just attributes of the records within these tables." I think it's kinda immaterial -- the point here was that, any time you have a bunch of fields that are similar to some OTHER bunch of fields in the same record (even if they mean something different), they are prime candidates for moving out into a separate Table that contains just that bunch. Often you'll find the fields next to each other in your Table design, because you probably thought of them at about the same time (e.g., someone's surname and the same person's phone #). Doing this will probably spoil your 1:1 relationship, as you'll now have 2 or 3 sets of fields (that you have moved to the other Table) that are linked to the original record, but there's nothing magical about a 1:1 relationship. Actually, though sometimes useful, they are kind of rare. As mentioned elsewhere, if you do pull out fields to a separate Table, you might need to add a field or two (in the new Table) to identify what they mean, such as a field to tell you if the physical region in this record came from the first such set in your original record or the third such set (if you care about that), or maybe a field to tell you which original Table the record came from (since you might collect them from more than one), or maybe a field to serve as a primary key (e.g., Autonumber type) in the new Table. Your database will still work if you do not do this, but it will possibly be unwieldy, and certain types of Queries will be very difficult to define. (Enough so, so that you may decide that you just can't get that type of information from your database.) So I think, before you add too many data to your Tables, that you might at least consider splitting out the repeating groups into a separate Table. And just in case you are worried about losing the ability to display all those fields in one record, you can always write Queries that will do that, even after you have split some of the fields out into a related Table. -- Vincent Johns Please feel free to quote anything I say here. |
#10
|
|||
|
|||
Circular relationships and limiting allowed values
"=?Utf-8?B?UmFyYXJhY2hlbA==?="
wrote in : What I meant (and apologies once again for confusion, I just didn't know how else to describe this one), is that the dataset table had relationships with both the Build and RegionDataset tables and the deployment table also had relationships with these two tables, which looks like a circle in the relationship viewer in Access) This is not neccessarily a problem: it all depends on the semantics. Relationships can look like loops in a diagram for a number of reasons: Self references: direct ones ] Employees.Supervisor references Employees.EmpID indirect Employees.DeptCode references Departments, Departments.Manager references Employees.EmpID Multiple references: Students.HomeAddress references Addresses Students.TermTimeAddress references Addresses Prescription.PatientID references Patients Patients.RegisteredGP references Practitioners Prescription.SignedBy references Practitioners The last one is the most interesting. It allows you to ask, how many prescriptions have been signed by someone other than the patient's usual GP, and so on. This is the closest to your example, and it reflects the fact that the two (chains of) relationships are semantically different. Truly redundant: Items(ID, MadeBy) references Products Products.MadeBy references Manufacturers Items.MadeBy references Manufacturers In this one, the third constraint is already implicit in the first two: you cannot remove a Manufacturer record where there is an exisiting Product, and you can't remove the Product record if there are existing Item records. It's not actually harmful, but it takes up computer resources unneccessarily. Hope that helps Tim F |
Thread Tools | |
Display Modes | |
|
|