If this is your first visit, be sure to check out the FAQ by clicking the link above. You may have to register before you can post: click the register link above to proceed. To start viewing messages, select the forum that you want to visit from the selection below. |
|
|
Thread Tools | Display Modes |
#1
|
|||
|
|||
How to work with 'multiple response sets' in Access
I have (in an Access 2003 database) a series of fields of the yes/no data
type. I would like to transform them into fields that have the names/topics of the existing fields as their possible answer categories. Example: Field 1: listen to radio yes/no, Field 2: watch tv yes/no, Field 3: read newspaper yes/no - to be transformed into Newfield 1, 2 and 3, all three with options radio/tv/newspaper. A record containing yes - no - yes for the initial fields would read radio - newspaper - NULL on the new fields. (In SPSS this is called a "multiple response set" as the set of fields can contain the answer to a single questionnaire question that allows more than one response.) If it is not possible to transform the existing fields, is there a way of having my users fill out a form with simple checkboxes, but the answers being stored in the kind of fields as I explained above? Thank you for any suggestion. |
#2
|
|||
|
|||
How to work with 'multiple response sets' in Access
What you are asking for is called a "repeating group" in Relational database
speak. In a relational database, once you have more than one of something, you have "many" and that requires a separate table. It is poor practice to "flatten" the many items into many columns. One reason is that columns are fixed and rows are not. If you add a column to a table, you'll also need to do other maintenance such as modify queries, forms, reports, and code. However, adding a row doesn't require any application changes. There are several ways to implement your request. If you can upgrade to A2007 the functionality is built in. Otherwise you'll need to build it yourself. 1. Codeless way - create a subform to show the selected values. Use a combo to allow the user to choose. 2. VBA way - use a multi-select listbox to manage the data on a form. However, since the multi-select listbox is not a bound control, you will need to do all the reading/writing behind the scenes. When the user leaves the listbox, you will need to add newly selected rows and delete no longer selected rows from the many-side table. "Wim" wrote in message ... I have (in an Access 2003 database) a series of fields of the yes/no data type. I would like to transform them into fields that have the names/topics of the existing fields as their possible answer categories. Example: Field 1: listen to radio yes/no, Field 2: watch tv yes/no, Field 3: read newspaper yes/no - to be transformed into Newfield 1, 2 and 3, all three with options radio/tv/newspaper. A record containing yes - no - yes for the initial fields would read radio - newspaper - NULL on the new fields. (In SPSS this is called a "multiple response set" as the set of fields can contain the answer to a single questionnaire question that allows more than one response.) If it is not possible to transform the existing fields, is there a way of having my users fill out a form with simple checkboxes, but the answers being stored in the kind of fields as I explained above? Thank you for any suggestion. |
#3
|
|||
|
|||
How to work with 'multiple response sets' in Access
Pat, thanks for trying to help me. I realise that I was actually asking
several questions in one. You were correct in pointing out that my questions were based on an example that does not comply with basic rules for the design of a relational database. So, let's start with that. My question is: what is the big advantage of this set-up in a case where the data are relatively static? Let’s, for the sake of argument, presume I have a questionnaire, composed of a series of questions, each allowing 1 answer, to be chosen among a group of predefined answers. The way to store the information for statistical analysis in e.g. Excel, would be in one table, in which every line contains the answers to one questionnaire and every column the answers to one question. In Access, however, I understand that the correct way to store the same information would be to have one table in which each line contains one questionnaire’s answer to one question. Now let’s say (example 1) my questionnaire has 60 questions (not unusual) and that I have 6000 questionnaires (also not an unusual case). That would mean that my table would have 360,000 lines! That does not seem very practical to me. Can you explain me what the big advantage is of storing the information this way? A slightly different situation occurs (example 2) when all of my 60 questions only allow for a “yes” or a “no” answer. In that case I could put into my table only the “yes”-answers, leaving out the “no”-answers. If, on average, every questionnaire had 10 “yes”-answers, that would reduce my table to 60,000 lines, still a bit bulky. Now let’s consider (example 3) a questionnaire with 60 questions in which only one (not more, not less) can be answered “yes”. In that case one line per questionnaire would be enough, stating only which of the 60 questions was answered affirmatively. My table would be reduced to 6000 lines. But I could use this same idea in example 2. Instead of one line stating which of the answers received an affirmative answers, I would several. Again, if the average number of affirmative answers per questionnaire were 10, my table would be back to 60,000 lines. The data I work with are not really from questionnaires, but they are comparable in that the structure is very unlikely to change much over time. The amount of data, however, is going to grow quickly. Why should I use a relational database structure and if I need to, which of the solutions I tried to describe above is the best? Thanks again for your willingness to help. "Pat Hartman (MVP)" wrote: What you are asking for is called a "repeating group" in Relational database speak. In a relational database, once you have more than one of something, you have "many" and that requires a separate table. It is poor practice to "flatten" the many items into many columns. One reason is that columns are fixed and rows are not. If you add a column to a table, you'll also need to do other maintenance such as modify queries, forms, reports, and code. However, adding a row doesn't require any application changes. There are several ways to implement your request. If you can upgrade to A2007 the functionality is built in. Otherwise you'll need to build it yourself. 1. Codeless way - create a subform to show the selected values. Use a combo to allow the user to choose. 2. VBA way - use a multi-select listbox to manage the data on a form. However, since the multi-select listbox is not a bound control, you will need to do all the reading/writing behind the scenes. When the user leaves the listbox, you will need to add newly selected rows and delete no longer selected rows from the many-side table. "Wim" wrote in message ... I have (in an Access 2003 database) a series of fields of the yes/no data type. I would like to transform them into fields that have the names/topics of the existing fields as their possible answer categories. Example: Field 1: listen to radio yes/no, Field 2: watch tv yes/no, Field 3: read newspaper yes/no - to be transformed into Newfield 1, 2 and 3, all three with options radio/tv/newspaper. A record containing yes - no - yes for the initial fields would read radio - newspaper - NULL on the new fields. (In SPSS this is called a "multiple response set" as the set of fields can contain the answer to a single questionnaire question that allows more than one response.) If it is not possible to transform the existing fields, is there a way of having my users fill out a form with simple checkboxes, but the answers being stored in the kind of fields as I explained above? Thank you for any suggestion. |
#4
|
|||
|
|||
How to work with 'multiple response sets' in Access
"That would mean that my table would have 360,000 lines! That does not seem
very practical to me. Can you explain me what the big advantage is of storing the information this way?" The advantage is that you can analyze it easily. To work with the flattened structure, you'll need a query for each separate column to analyze the data so the more columns you have, the more queries you'll need. Not to mention the nightmare of adding additional columns for future questionnaires. "Now let's consider (example 3) a questionnaire with 60 questions in which only one (not more, not less) can be answered "yes". In that case one line per questionnaire would be enough, stating only which of the 60 questions was answered affirmatively. My table would be reduced to 6000 lines." If only one question is answered in the first scenario, you would have only 6000 rows. The extra advantage of the normal structure is only populated rows need to exist. There is no question in my mind that rows wins over columns. You will just have to try building queries based on both structures to convince yourself. Don't forget that relational databases do not support functions that work "across" the columns way that Excel does. In a relational query, functions operate on a domain which is defined by the number of rows selected in a query/table. If you wanted to determine which of the 60 columns contained the "yes", you'd need to interrogate the 60 columns specifically by name. With the normalized structure, you just have criteria that returns all the rows for a particular survey - one criteria verses 60 nested if statements. Give me the rows!!!! If your questionnaire is sparse (meaning that not all questions need to be answered), the normalized structure could likely take less overall space than the flat structure even though it takes considerably more rows. Access is capable of handling millions of rows efficiently. "Wim" wrote in message ... Pat, thanks for trying to help me. I realise that I was actually asking several questions in one. You were correct in pointing out that my questions were based on an example that does not comply with basic rules for the design of a relational database. So, let's start with that. My question is: what is the big advantage of this set-up in a case where the data are relatively static? Let's, for the sake of argument, presume I have a questionnaire, composed of a series of questions, each allowing 1 answer, to be chosen among a group of predefined answers. The way to store the information for statistical analysis in e.g. Excel, would be in one table, in which every line contains the answers to one questionnaire and every column the answers to one question. In Access, however, I understand that the correct way to store the same information would be to have one table in which each line contains one questionnaire's answer to one question. Now let's say (example 1) my questionnaire has 60 questions (not unusual) and that I have 6000 questionnaires (also not an unusual case). That would mean that my table would have 360,000 lines! That does not seem very practical to me. Can you explain me what the big advantage is of storing the information this way? A slightly different situation occurs (example 2) when all of my 60 questions only allow for a "yes" or a "no" answer. In that case I could put into my table only the "yes"-answers, leaving out the "no"-answers. If, on average, every questionnaire had 10 "yes"-answers, that would reduce my table to 60,000 lines, still a bit bulky. Now let's consider (example 3) a questionnaire with 60 questions in which only one (not more, not less) can be answered "yes". In that case one line per questionnaire would be enough, stating only which of the 60 questions was answered affirmatively. My table would be reduced to 6000 lines. But I could use this same idea in example 2. Instead of one line stating which of the answers received an affirmative answers, I would several. Again, if the average number of affirmative answers per questionnaire were 10, my table would be back to 60,000 lines. The data I work with are not really from questionnaires, but they are comparable in that the structure is very unlikely to change much over time. The amount of data, however, is going to grow quickly. Why should I use a relational database structure and if I need to, which of the solutions I tried to describe above is the best? Thanks again for your willingness to help. "Pat Hartman (MVP)" wrote: What you are asking for is called a "repeating group" in Relational database speak. In a relational database, once you have more than one of something, you have "many" and that requires a separate table. It is poor practice to "flatten" the many items into many columns. One reason is that columns are fixed and rows are not. If you add a column to a table, you'll also need to do other maintenance such as modify queries, forms, reports, and code. However, adding a row doesn't require any application changes. There are several ways to implement your request. If you can upgrade to A2007 the functionality is built in. Otherwise you'll need to build it yourself. 1. Codeless way - create a subform to show the selected values. Use a combo to allow the user to choose. 2. VBA way - use a multi-select listbox to manage the data on a form. However, since the multi-select listbox is not a bound control, you will need to do all the reading/writing behind the scenes. When the user leaves the listbox, you will need to add newly selected rows and delete no longer selected rows from the many-side table. "Wim" wrote in message ... I have (in an Access 2003 database) a series of fields of the yes/no data type. I would like to transform them into fields that have the names/topics of the existing fields as their possible answer categories. Example: Field 1: listen to radio yes/no, Field 2: watch tv yes/no, Field 3: read newspaper yes/no - to be transformed into Newfield 1, 2 and 3, all three with options radio/tv/newspaper. A record containing yes - no - yes for the initial fields would read radio - newspaper - NULL on the new fields. (In SPSS this is called a "multiple response set" as the set of fields can contain the answer to a single questionnaire question that allows more than one response.) If it is not possible to transform the existing fields, is there a way of having my users fill out a form with simple checkboxes, but the answers being stored in the kind of fields as I explained above? Thank you for any suggestion. |
#5
|
|||
|
|||
How to work with 'multiple response sets' in Access
Wim, you might also benefit from studying the often-recommended sample
survey database created by MVP Duane Hookom, at http://www.rogersaccesslibrary.com/O...p#Hookom,Duane. hth "Wim" wrote in message ... Pat, thanks for trying to help me. I realise that I was actually asking several questions in one. You were correct in pointing out that my questions were based on an example that does not comply with basic rules for the design of a relational database. So, let's start with that. My question is: what is the big advantage of this set-up in a case where the data are relatively static? Let's, for the sake of argument, presume I have a questionnaire, composed of a series of questions, each allowing 1 answer, to be chosen among a group of predefined answers. The way to store the information for statistical analysis in e.g. Excel, would be in one table, in which every line contains the answers to one questionnaire and every column the answers to one question. In Access, however, I understand that the correct way to store the same information would be to have one table in which each line contains one questionnaire's answer to one question. Now let's say (example 1) my questionnaire has 60 questions (not unusual) and that I have 6000 questionnaires (also not an unusual case). That would mean that my table would have 360,000 lines! That does not seem very practical to me. Can you explain me what the big advantage is of storing the information this way? A slightly different situation occurs (example 2) when all of my 60 questions only allow for a "yes" or a "no" answer. In that case I could put into my table only the "yes"-answers, leaving out the "no"-answers. If, on average, every questionnaire had 10 "yes"-answers, that would reduce my table to 60,000 lines, still a bit bulky. Now let's consider (example 3) a questionnaire with 60 questions in which only one (not more, not less) can be answered "yes". In that case one line per questionnaire would be enough, stating only which of the 60 questions was answered affirmatively. My table would be reduced to 6000 lines. But I could use this same idea in example 2. Instead of one line stating which of the answers received an affirmative answers, I would several. Again, if the average number of affirmative answers per questionnaire were 10, my table would be back to 60,000 lines. The data I work with are not really from questionnaires, but they are comparable in that the structure is very unlikely to change much over time. The amount of data, however, is going to grow quickly. Why should I use a relational database structure and if I need to, which of the solutions I tried to describe above is the best? Thanks again for your willingness to help. "Pat Hartman (MVP)" wrote: What you are asking for is called a "repeating group" in Relational database speak. In a relational database, once you have more than one of something, you have "many" and that requires a separate table. It is poor practice to "flatten" the many items into many columns. One reason is that columns are fixed and rows are not. If you add a column to a table, you'll also need to do other maintenance such as modify queries, forms, reports, and code. However, adding a row doesn't require any application changes. There are several ways to implement your request. If you can upgrade to A2007 the functionality is built in. Otherwise you'll need to build it yourself. 1. Codeless way - create a subform to show the selected values. Use a combo to allow the user to choose. 2. VBA way - use a multi-select listbox to manage the data on a form. However, since the multi-select listbox is not a bound control, you will need to do all the reading/writing behind the scenes. When the user leaves the listbox, you will need to add newly selected rows and delete no longer selected rows from the many-side table. "Wim" wrote in message ... I have (in an Access 2003 database) a series of fields of the yes/no data type. I would like to transform them into fields that have the names/topics of the existing fields as their possible answer categories. Example: Field 1: listen to radio yes/no, Field 2: watch tv yes/no, Field 3: read newspaper yes/no - to be transformed into Newfield 1, 2 and 3, all three with options radio/tv/newspaper. A record containing yes - no - yes for the initial fields would read radio - newspaper - NULL on the new fields. (In SPSS this is called a "multiple response set" as the set of fields can contain the answer to a single questionnaire question that allows more than one response.) If it is not possible to transform the existing fields, is there a way of having my users fill out a form with simple checkboxes, but the answers being stored in the kind of fields as I explained above? Thank you for any suggestion. |
#6
|
|||
|
|||
How to work with 'multiple response sets' in Access
Thanks again, Pat, for answering my question. But I have to say I’m not fully
convinced yet. As for the first part of your answer, I’d say it just shows that for analysing data, Access simply isn’t a good choice. Fortunately it’s easy to exchange data between Access and Excel. But imagine I have two columns with numerical data and I want to calculate the correlation between them; that would be a lot easier in the “flattened structure” than in what you call a “normalized” structure. Also, I do not understand why you consider adding columns such a “nightmare”. It doesn’t seem that complicated to me. And I wouldn’t have to do it very often. But let’s get to my second question. Suppose I have my questionnaire data (several hundreds of records) in a flattened table, 60 columns long, all of the yes/no data type, and I want to transform them into a normalized table. Is there a way to do this, without having to type in everything for a second time? I suppose this has to be done by means of a query, but I just can’t figure out how. Can you help? "Pat Hartman (MVP)" wrote: "That would mean that my table would have 360,000 lines! That does not seem very practical to me. Can you explain me what the big advantage is of storing the information this way?" The advantage is that you can analyze it easily. To work with the flattened structure, you'll need a query for each separate column to analyze the data so the more columns you have, the more queries you'll need. Not to mention the nightmare of adding additional columns for future questionnaires. "Now let's consider (example 3) a questionnaire with 60 questions in which only one (not more, not less) can be answered "yes". In that case one line per questionnaire would be enough, stating only which of the 60 questions was answered affirmatively. My table would be reduced to 6000 lines." If only one question is answered in the first scenario, you would have only 6000 rows. The extra advantage of the normal structure is only populated rows need to exist. There is no question in my mind that rows wins over columns. You will just have to try building queries based on both structures to convince yourself. Don't forget that relational databases do not support functions that work "across" the columns way that Excel does. In a relational query, functions operate on a domain which is defined by the number of rows selected in a query/table. If you wanted to determine which of the 60 columns contained the "yes", you'd need to interrogate the 60 columns specifically by name. With the normalized structure, you just have criteria that returns all the rows for a particular survey - one criteria verses 60 nested if statements. Give me the rows!!!! If your questionnaire is sparse (meaning that not all questions need to be answered), the normalized structure could likely take less overall space than the flat structure even though it takes considerably more rows. Access is capable of handling millions of rows efficiently. "Wim" wrote in message ... Pat, thanks for trying to help me. I realise that I was actually asking several questions in one. You were correct in pointing out that my questions were based on an example that does not comply with basic rules for the design of a relational database. So, let's start with that. My question is: what is the big advantage of this set-up in a case where the data are relatively static? Let's, for the sake of argument, presume I have a questionnaire, composed of a series of questions, each allowing 1 answer, to be chosen among a group of predefined answers. The way to store the information for statistical analysis in e.g. Excel, would be in one table, in which every line contains the answers to one questionnaire and every column the answers to one question. In Access, however, I understand that the correct way to store the same information would be to have one table in which each line contains one questionnaire's answer to one question. Now let's say (example 1) my questionnaire has 60 questions (not unusual) and that I have 6000 questionnaires (also not an unusual case). That would mean that my table would have 360,000 lines! That does not seem very practical to me. Can you explain me what the big advantage is of storing the information this way? A slightly different situation occurs (example 2) when all of my 60 questions only allow for a "yes" or a "no" answer. In that case I could put into my table only the "yes"-answers, leaving out the "no"-answers. If, on average, every questionnaire had 10 "yes"-answers, that would reduce my table to 60,000 lines, still a bit bulky. Now let's consider (example 3) a questionnaire with 60 questions in which only one (not more, not less) can be answered "yes". In that case one line per questionnaire would be enough, stating only which of the 60 questions was answered affirmatively. My table would be reduced to 6000 lines. But I could use this same idea in example 2. Instead of one line stating which of the answers received an affirmative answers, I would several. Again, if the average number of affirmative answers per questionnaire were 10, my table would be back to 60,000 lines. The data I work with are not really from questionnaires, but they are comparable in that the structure is very unlikely to change much over time. The amount of data, however, is going to grow quickly. Why should I use a relational database structure and if I need to, which of the solutions I tried to describe above is the best? Thanks again for your willingness to help. "Pat Hartman (MVP)" wrote: What you are asking for is called a "repeating group" in Relational database speak. In a relational database, once you have more than one of something, you have "many" and that requires a separate table. It is poor practice to "flatten" the many items into many columns. One reason is that columns are fixed and rows are not. If you add a column to a table, you'll also need to do other maintenance such as modify queries, forms, reports, and code. However, adding a row doesn't require any application changes. There are several ways to implement your request. If you can upgrade to A2007 the functionality is built in. Otherwise you'll need to build it yourself. 1. Codeless way - create a subform to show the selected values. Use a combo to allow the user to choose. 2. VBA way - use a multi-select listbox to manage the data on a form. However, since the multi-select listbox is not a bound control, you will need to do all the reading/writing behind the scenes. When the user leaves the listbox, you will need to add newly selected rows and delete no longer selected rows from the many-side table. "Wim" wrote in message ... I have (in an Access 2003 database) a series of fields of the yes/no data type. I would like to transform them into fields that have the names/topics of the existing fields as their possible answer categories. Example: Field 1: listen to radio yes/no, Field 2: watch tv yes/no, Field 3: read newspaper yes/no - to be transformed into Newfield 1, 2 and 3, all three with options radio/tv/newspaper. A record containing yes - no - yes for the initial fields would read radio - newspaper - NULL on the new fields. (In SPSS this is called a "multiple response set" as the set of fields can contain the answer to a single questionnaire question that allows more than one response.) If it is not possible to transform the existing fields, is there a way of having my users fill out a form with simple checkboxes, but the answers being stored in the kind of fields as I explained above? Thank you for any suggestion. |
#7
|
|||
|
|||
How to work with 'multiple response sets' in Access
Thanks for the tip, tina, I will have a good look at it.
"tina" wrote: Wim, you might also benefit from studying the often-recommended sample survey database created by MVP Duane Hookom, at http://www.rogersaccesslibrary.com/O...p#Hookom,Duane. hth "Wim" wrote in message ... Pat, thanks for trying to help me. I realise that I was actually asking several questions in one. You were correct in pointing out that my questions were based on an example that does not comply with basic rules for the design of a relational database. So, let's start with that. My question is: what is the big advantage of this set-up in a case where the data are relatively static? Let's, for the sake of argument, presume I have a questionnaire, composed of a series of questions, each allowing 1 answer, to be chosen among a group of predefined answers. The way to store the information for statistical analysis in e.g. Excel, would be in one table, in which every line contains the answers to one questionnaire and every column the answers to one question. In Access, however, I understand that the correct way to store the same information would be to have one table in which each line contains one questionnaire's answer to one question. Now let's say (example 1) my questionnaire has 60 questions (not unusual) and that I have 6000 questionnaires (also not an unusual case). That would mean that my table would have 360,000 lines! That does not seem very practical to me. Can you explain me what the big advantage is of storing the information this way? A slightly different situation occurs (example 2) when all of my 60 questions only allow for a "yes" or a "no" answer. In that case I could put into my table only the "yes"-answers, leaving out the "no"-answers. If, on average, every questionnaire had 10 "yes"-answers, that would reduce my table to 60,000 lines, still a bit bulky. Now let's consider (example 3) a questionnaire with 60 questions in which only one (not more, not less) can be answered "yes". In that case one line per questionnaire would be enough, stating only which of the 60 questions was answered affirmatively. My table would be reduced to 6000 lines. But I could use this same idea in example 2. Instead of one line stating which of the answers received an affirmative answers, I would several. Again, if the average number of affirmative answers per questionnaire were 10, my table would be back to 60,000 lines. The data I work with are not really from questionnaires, but they are comparable in that the structure is very unlikely to change much over time. The amount of data, however, is going to grow quickly. Why should I use a relational database structure and if I need to, which of the solutions I tried to describe above is the best? Thanks again for your willingness to help. "Pat Hartman (MVP)" wrote: What you are asking for is called a "repeating group" in Relational database speak. In a relational database, once you have more than one of something, you have "many" and that requires a separate table. It is poor practice to "flatten" the many items into many columns. One reason is that columns are fixed and rows are not. If you add a column to a table, you'll also need to do other maintenance such as modify queries, forms, reports, and code. However, adding a row doesn't require any application changes. There are several ways to implement your request. If you can upgrade to A2007 the functionality is built in. Otherwise you'll need to build it yourself. 1. Codeless way - create a subform to show the selected values. Use a combo to allow the user to choose. 2. VBA way - use a multi-select listbox to manage the data on a form. However, since the multi-select listbox is not a bound control, you will need to do all the reading/writing behind the scenes. When the user leaves the listbox, you will need to add newly selected rows and delete no longer selected rows from the many-side table. "Wim" wrote in message ... I have (in an Access 2003 database) a series of fields of the yes/no data type. I would like to transform them into fields that have the names/topics of the existing fields as their possible answer categories. Example: Field 1: listen to radio yes/no, Field 2: watch tv yes/no, Field 3: read newspaper yes/no - to be transformed into Newfield 1, 2 and 3, all three with options radio/tv/newspaper. A record containing yes - no - yes for the initial fields would read radio - newspaper - NULL on the new fields. (In SPSS this is called a "multiple response set" as the set of fields can contain the answer to a single questionnaire question that allows more than one response.) If it is not possible to transform the existing fields, is there a way of having my users fill out a form with simple checkboxes, but the answers being stored in the kind of fields as I explained above? Thank you for any suggestion. |
#8
|
|||
|
|||
How to work with 'multiple response sets' in Access
actually, Access is an excellent and very powerful tool for data analysis.
but you can't expect to use it the same way that you use Excel. Excel is specifically a flatfile data format, and Access is specifically a relational data format. the tools available in each software program are optimized to work with the data in its' particular format. if you prefer to work with a flatfile format, you'd be better off to stick with Excel. if you want to use the tools in Access to analyze your data, you need to embrace the concept of a relational structure for the data. otherwise, you're essentially wasting your time because you cannot fully, or easily, leverage the power of Access when the data is stored in a flatfile structure, period. to answer your question about migrating data into a normalized table structu i'm sure it's do-able, but i don't think i can offer any constructive suggestions until we share a common basic "relational" language. suggest you read up on the basics of data normalization, so the suggestions you get from me - or anyone else here - will make sense to you. for more information, see http://home.bendbroadband.com/conrad...abaseDesign101 hth "Wim" wrote in message ... Thanks again, Pat, for answering my question. But I have to say I'm not fully convinced yet. As for the first part of your answer, I'd say it just shows that for analysing data, Access simply isn't a good choice. Fortunately it's easy to exchange data between Access and Excel. But imagine I have two columns with numerical data and I want to calculate the correlation between them; that would be a lot easier in the "flattened structure" than in what you call a "normalized" structure. Also, I do not understand why you consider adding columns such a "nightmare". It doesn't seem that complicated to me. And I wouldn't have to do it very often. But let's get to my second question. Suppose I have my questionnaire data (several hundreds of records) in a flattened table, 60 columns long, all of the yes/no data type, and I want to transform them into a normalized table. Is there a way to do this, without having to type in everything for a second time? I suppose this has to be done by means of a query, but I just can't figure out how. Can you help? "Pat Hartman (MVP)" wrote: "That would mean that my table would have 360,000 lines! That does not seem very practical to me. Can you explain me what the big advantage is of storing the information this way?" The advantage is that you can analyze it easily. To work with the flattened structure, you'll need a query for each separate column to analyze the data so the more columns you have, the more queries you'll need. Not to mention the nightmare of adding additional columns for future questionnaires. "Now let's consider (example 3) a questionnaire with 60 questions in which only one (not more, not less) can be answered "yes". In that case one line per questionnaire would be enough, stating only which of the 60 questions was answered affirmatively. My table would be reduced to 6000 lines." If only one question is answered in the first scenario, you would have only 6000 rows. The extra advantage of the normal structure is only populated rows need to exist. There is no question in my mind that rows wins over columns. You will just have to try building queries based on both structures to convince yourself. Don't forget that relational databases do not support functions that work "across" the columns way that Excel does. In a relational query, functions operate on a domain which is defined by the number of rows selected in a query/table. If you wanted to determine which of the 60 columns contained the "yes", you'd need to interrogate the 60 columns specifically by name. With the normalized structure, you just have criteria that returns all the rows for a particular survey - one criteria verses 60 nested if statements. Give me the rows!!!! If your questionnaire is sparse (meaning that not all questions need to be answered), the normalized structure could likely take less overall space than the flat structure even though it takes considerably more rows. Access is capable of handling millions of rows efficiently. "Wim" wrote in message ... Pat, thanks for trying to help me. I realise that I was actually asking several questions in one. You were correct in pointing out that my questions were based on an example that does not comply with basic rules for the design of a relational database. So, let's start with that. My question is: what is the big advantage of this set-up in a case where the data are relatively static? Let's, for the sake of argument, presume I have a questionnaire, composed of a series of questions, each allowing 1 answer, to be chosen among a group of predefined answers. The way to store the information for statistical analysis in e.g. Excel, would be in one table, in which every line contains the answers to one questionnaire and every column the answers to one question. In Access, however, I understand that the correct way to store the same information would be to have one table in which each line contains one questionnaire's answer to one question. Now let's say (example 1) my questionnaire has 60 questions (not unusual) and that I have 6000 questionnaires (also not an unusual case). That would mean that my table would have 360,000 lines! That does not seem very practical to me. Can you explain me what the big advantage is of storing the information this way? A slightly different situation occurs (example 2) when all of my 60 questions only allow for a "yes" or a "no" answer. In that case I could put into my table only the "yes"-answers, leaving out the "no"-answers. If, on average, every questionnaire had 10 "yes"-answers, that would reduce my table to 60,000 lines, still a bit bulky. Now let's consider (example 3) a questionnaire with 60 questions in which only one (not more, not less) can be answered "yes". In that case one line per questionnaire would be enough, stating only which of the 60 questions was answered affirmatively. My table would be reduced to 6000 lines. But I could use this same idea in example 2. Instead of one line stating which of the answers received an affirmative answers, I would several. Again, if the average number of affirmative answers per questionnaire were 10, my table would be back to 60,000 lines. The data I work with are not really from questionnaires, but they are comparable in that the structure is very unlikely to change much over time. The amount of data, however, is going to grow quickly. Why should I use a relational database structure and if I need to, which of the solutions I tried to describe above is the best? Thanks again for your willingness to help. "Pat Hartman (MVP)" wrote: What you are asking for is called a "repeating group" in Relational database speak. In a relational database, once you have more than one of something, you have "many" and that requires a separate table. It is poor practice to "flatten" the many items into many columns. One reason is that columns are fixed and rows are not. If you add a column to a table, you'll also need to do other maintenance such as modify queries, forms, reports, and code. However, adding a row doesn't require any application changes. There are several ways to implement your request. If you can upgrade to A2007 the functionality is built in. Otherwise you'll need to build it yourself. 1. Codeless way - create a subform to show the selected values. Use a combo to allow the user to choose. 2. VBA way - use a multi-select listbox to manage the data on a form. However, since the multi-select listbox is not a bound control, you will need to do all the reading/writing behind the scenes. When the user leaves the listbox, you will need to add newly selected rows and delete no longer selected rows from the many-side table. "Wim" wrote in message ... I have (in an Access 2003 database) a series of fields of the yes/no data type. I would like to transform them into fields that have the names/topics of the existing fields as their possible answer categories. Example: Field 1: listen to radio yes/no, Field 2: watch tv yes/no, Field 3: read newspaper yes/no - to be transformed into Newfield 1, 2 and 3, all three with options radio/tv/newspaper. A record containing yes - no - yes for the initial fields would read radio - newspaper - NULL on the new fields. (In SPSS this is called a "multiple response set" as the set of fields can contain the answer to a single questionnaire question that allows more than one response.) If it is not possible to transform the existing fields, is there a way of having my users fill out a form with simple checkboxes, but the answers being stored in the kind of fields as I explained above? Thank you for any suggestion. |
#9
|
|||
|
|||
How to work with 'multiple response sets' in Access
Tina, thanks for your reply.
I have actually read quite a lot about the theory of relational databases, entity-relationship modeling and the theory of normalisation. What I do not have is much experience. Also, I am not defending Excel or flatfile stucture as opposed to Access or relational structure. I am sorry if I gave you that impression. For part of the data I am working with, the relational structure seems to be the best option. My doubts are about part of my data that have a structure comparable to survey data. So, please be so kind to tell me what suggestions you have for me... Thanks. "tina" wrote: actually, Access is an excellent and very powerful tool for data analysis. but you can't expect to use it the same way that you use Excel. Excel is specifically a flatfile data format, and Access is specifically a relational data format. the tools available in each software program are optimized to work with the data in its' particular format. if you prefer to work with a flatfile format, you'd be better off to stick with Excel. if you want to use the tools in Access to analyze your data, you need to embrace the concept of a relational structure for the data. otherwise, you're essentially wasting your time because you cannot fully, or easily, leverage the power of Access when the data is stored in a flatfile structure, period. to answer your question about migrating data into a normalized table structu i'm sure it's do-able, but i don't think i can offer any constructive suggestions until we share a common basic "relational" language. suggest you read up on the basics of data normalization, so the suggestions you get from me - or anyone else here - will make sense to you. for more information, see http://home.bendbroadband.com/conrad...abaseDesign101 hth "Wim" wrote in message ... Thanks again, Pat, for answering my question. But I have to say I'm not fully convinced yet. As for the first part of your answer, I'd say it just shows that for analysing data, Access simply isn't a good choice. Fortunately it's easy to exchange data between Access and Excel. But imagine I have two columns with numerical data and I want to calculate the correlation between them; that would be a lot easier in the "flattened structure" than in what you call a "normalized" structure. Also, I do not understand why you consider adding columns such a "nightmare". It doesn't seem that complicated to me. And I wouldn't have to do it very often. But let's get to my second question. Suppose I have my questionnaire data (several hundreds of records) in a flattened table, 60 columns long, all of the yes/no data type, and I want to transform them into a normalized table. Is there a way to do this, without having to type in everything for a second time? I suppose this has to be done by means of a query, but I just can't figure out how. Can you help? "Pat Hartman (MVP)" wrote: "That would mean that my table would have 360,000 lines! That does not seem very practical to me. Can you explain me what the big advantage is of storing the information this way?" The advantage is that you can analyze it easily. To work with the flattened structure, you'll need a query for each separate column to analyze the data so the more columns you have, the more queries you'll need. Not to mention the nightmare of adding additional columns for future questionnaires. "Now let's consider (example 3) a questionnaire with 60 questions in which only one (not more, not less) can be answered "yes". In that case one line per questionnaire would be enough, stating only which of the 60 questions was answered affirmatively. My table would be reduced to 6000 lines." If only one question is answered in the first scenario, you would have only 6000 rows. The extra advantage of the normal structure is only populated rows need to exist. There is no question in my mind that rows wins over columns. You will just have to try building queries based on both structures to convince yourself. Don't forget that relational databases do not support functions that work "across" the columns way that Excel does. In a relational query, functions operate on a domain which is defined by the number of rows selected in a query/table. If you wanted to determine which of the 60 columns contained the "yes", you'd need to interrogate the 60 columns specifically by name. With the normalized structure, you just have criteria that returns all the rows for a particular survey - one criteria verses 60 nested if statements. Give me the rows!!!! If your questionnaire is sparse (meaning that not all questions need to be answered), the normalized structure could likely take less overall space than the flat structure even though it takes considerably more rows. Access is capable of handling millions of rows efficiently. "Wim" wrote in message ... Pat, thanks for trying to help me. I realise that I was actually asking several questions in one. You were correct in pointing out that my questions were based on an example that does not comply with basic rules for the design of a relational database. So, let's start with that. My question is: what is the big advantage of this set-up in a case where the data are relatively static? Let's, for the sake of argument, presume I have a questionnaire, composed of a series of questions, each allowing 1 answer, to be chosen among a group of predefined answers. The way to store the information for statistical analysis in e.g. Excel, would be in one table, in which every line contains the answers to one questionnaire and every column the answers to one question. In Access, however, I understand that the correct way to store the same information would be to have one table in which each line contains one questionnaire's answer to one question. Now let's say (example 1) my questionnaire has 60 questions (not unusual) and that I have 6000 questionnaires (also not an unusual case). That would mean that my table would have 360,000 lines! That does not seem very practical to me. Can you explain me what the big advantage is of storing the information this way? A slightly different situation occurs (example 2) when all of my 60 questions only allow for a "yes" or a "no" answer. In that case I could put into my table only the "yes"-answers, leaving out the "no"-answers. If, on average, every questionnaire had 10 "yes"-answers, that would reduce my table to 60,000 lines, still a bit bulky. Now let's consider (example 3) a questionnaire with 60 questions in which only one (not more, not less) can be answered "yes". In that case one line per questionnaire would be enough, stating only which of the 60 questions was answered affirmatively. My table would be reduced to 6000 lines. But I could use this same idea in example 2. Instead of one line stating which of the answers received an affirmative answers, I would several. Again, if the average number of affirmative answers per questionnaire were 10, my table would be back to 60,000 lines. The data I work with are not really from questionnaires, but they are comparable in that the structure is very unlikely to change much over time. The amount of data, however, is going to grow quickly. Why should I use a relational database structure and if I need to, which of the solutions I tried to describe above is the best? Thanks again for your willingness to help. "Pat Hartman (MVP)" wrote: What you are asking for is called a "repeating group" in Relational database speak. In a relational database, once you have more than one of something, you have "many" and that requires a separate table. It is poor practice to "flatten" the many items into many columns. One reason is that columns are fixed and rows are not. If you add a column to a table, you'll also need to do other maintenance such as modify queries, forms, reports, and code. However, adding a row doesn't require any application changes. There are several ways to implement your request. If you can upgrade to A2007 the functionality is built in. Otherwise you'll need to build it yourself. 1. Codeless way - create a subform to show the selected values. Use a combo to allow the user to choose. 2. VBA way - use a multi-select listbox to manage the data on a form. However, since the multi-select listbox is not a bound control, you will need to do all the reading/writing behind the scenes. When the user leaves the listbox, you will need to add newly selected rows and delete no longer selected rows from the many-side table. "Wim" wrote in message ... I have (in an Access 2003 database) a series of fields of the yes/no data type. I would like to transform them into fields that have the names/topics of the existing fields as their possible answer categories. Example: Field 1: listen to radio yes/no, Field 2: watch tv yes/no, Field 3: read newspaper yes/no - to be transformed into Newfield 1, 2 and 3, all three with options radio/tv/newspaper. A record containing yes - no - yes for the initial fields would read radio - newspaper - NULL on the new fields. (In SPSS this is called a "multiple response set" as the set of fields can contain the answer to a single questionnaire question that allows more than one response.) If it is not possible to transform the existing fields, is there a way of having my users fill out a form with simple checkboxes, but the answers being stored in the kind of fields as I explained above? Thank you for any suggestion. |
#10
|
|||
|
|||
How to work with 'multiple response sets' in Access
okay, let's see if i can offer something helpful, here. i'll base my
suggestions on the following, taken from a previous post in this thread: Suppose I have my questionnaire data (several hundreds of records) in a flattened table, 60 columns long, all of the yes/no data type, and I want to transform them into a normalized table. Is there a way to do this, without having to type in everything for a second time? I suppose this has to be done by means of a query, but I just can't figure out how. Can you help? basically, you need a table of questions, one record for each question. if you have 60 different questions, one for each column in the spreadsheet, then the table will hold 60 records, one question per record. Sixty is not so many, i'd just enter the questions manually - it'll probably be quicker than trying to migrate that data programmatically. you need a table of answer *options*. if all 60 questions have the same three answer options, then this table will only have three records. if some questions will have different options than others, then you have a many-to-many relationship between questions and options: one question may have many answer options, and one option may be available for many questions. in this case, your options table should list all possible options, one record for each option. then you'll need a third table to link options to questions - each record will store one question/answer option combination. so a question with three answer options will have three records in this table; a question with five answer options will have five records in this table, etc. then you need a table to store the actual options chosen for each question, by each survey respondent. you'll also need a table to store data that is *survey-specific*, rather than answer-specific, because you don't want repeating data stored with each answer on a particular completed survey. so your answers table will include a foreign key pointing back to a particular completed survey, and a foreign key pointing back to a particular question option (that linking table described in the paragraph above). note that you do *not* need a foreign key pointing to a record in the questions table, because that key value is already stored in the linking table. again, you really should study Duane Hookom's sample survey database to see how this is all laid out and connected. okay, that's basically the tables you'll need. migrating the data from a spreadsheet format will be ugly, no doubt about it - i wouldn't be surprised if it takes an entire day, or two, to do it. once you have the questions table, options table, and question-options table set up, and the spreadsheet data imported into its' own table, you should stop and back up your database. that way if you mess up and have to start over, you have a clean "original" to copy and begin again. add a primary key field to the spreadsheet table (an autonumber field will do fine). now each survey record is uniquely identified. use an Append query to copy the *survey-specific* data into the surveys table, making sure that you also append the primary key field. you'll have to migrate each answer column in the spreadsheet table *separately*. you're appending answer data from the spreadsheet table into the answers table; make sure you include the primary key field in the append. set criteria on the spreadsheet column you're working with: "yes" if it's a Text data type, True if it's a Yes/No (True/False) data type. manually enter the primary key value of the appropriate answer option from the answer options table, to be appended to the answer field in the answers table. the end result will be that, for the spreadsheet answer column you working with, all surveys having a Yes answer to that question will be migrated into the answers table using the appropriate key value from the answer options table. for instance, out of the 6000 records in the spreadsheet table, if 3010 answered yes in the first answer column, then the query will append 3010 records into the answers table. once you've migrated the first column of answer data from the spreadsheet table into the answers table, stop and check it for accuracy. create a Select query, linking the surveys table, the answers table, and the question options table, and the questions table. pull the survey fields, and the question option fields, and the questions fields into the grid, and view it to make sure you're seeing the correct answer for the correct question, for the appropriate number of survey records. if you're only doing this migration once, i wouldn't bother writing a separate Append query for each column and saving the query, etc. just write an Append query to handle migrating the first answer column, run it, then modify it to handle the second answer column, run it, etc. this is hard to explain in the abstract, so i hope you at least got an idea of how to do it, if not all the (fuzzy) details. hth |
Thread Tools | |
Display Modes | |
|
|