How to work with 'multiple response sets' in Access

**Wim** · February 15th, 2007, 11:59 PM posted to microsoft.public.access.tablesdbdesign

I have (in an Access 2003 database) a series of fields of the yes/no data
type. I would like to transform them into fields that have the names/topics
of the existing fields as their possible answer categories.
Example: Field 1: listen to radio yes/no, Field 2: watch tv yes/no, Field 3:
read newspaper yes/no - to be transformed into Newfield 1, 2 and 3, all three
with options radio/tv/newspaper. A record containing yes - no - yes for the
initial fields would read radio - newspaper - NULL on the new fields.
(In SPSS this is called a "multiple response set" as the set of fields can
contain the answer to a single questionnaire question that allows more than
one response.)
If it is not possible to transform the existing fields, is there a way of
having my users fill out a form with simple checkboxes, but the answers being
stored in the kind of fields as I explained above?
Thank you for any suggestion.

**Pat Hartman \(MVP\)** · February 16th, 2007, 03:00 PM posted to microsoft.public.access.tablesdbdesign

What you are asking for is called a "repeating group" in Relational database
speak. In a relational database, once you have more than one of something,
you have "many" and that requires a separate table. It is poor practice to
"flatten" the many items into many columns. One reason is that columns are
fixed and rows are not. If you add a column to a table, you'll also need to
do other maintenance such as modify queries, forms, reports, and code.
However, adding a row doesn't require any application changes. There are
several ways to implement your request. If you can upgrade to A2007 the
functionality is built in. Otherwise you'll need to build it yourself.
1. Codeless way - create a subform to show the selected values. Use a combo
to allow the user to choose.
2. VBA way - use a multi-select listbox to manage the data on a form.
However, since the multi-select listbox is not a bound control, you will
need to do all the reading/writing behind the scenes. When the user leaves
the listbox, you will need to add newly selected rows and delete no longer
selected rows from the many-side table.

"Wim" wrote in message
...
I have (in an Access 2003 database) a series of fields of the yes/no data
type. I would like to transform them into fields that have the
names/topics
of the existing fields as their possible answer categories.
Example: Field 1: listen to radio yes/no, Field 2: watch tv yes/no, Field
3:
read newspaper yes/no - to be transformed into Newfield 1, 2 and 3, all
three
with options radio/tv/newspaper. A record containing yes - no - yes for
the
initial fields would read radio - newspaper - NULL on the new fields.
(In SPSS this is called a "multiple response set" as the set of fields can
contain the answer to a single questionnaire question that allows more
than
one response.)
If it is not possible to transform the existing fields, is there a way of
having my users fill out a form with simple checkboxes, but the answers
being
stored in the kind of fields as I explained above?
Thank you for any suggestion.

**Wim** · February 23rd, 2007, 03:06 PM posted to microsoft.public.access.tablesdbdesign

Pat, thanks for trying to help me. I realise that I was actually asking
several questions in one. You were correct in pointing out that my questions
were based on an example that does not comply with basic rules for the design
of a relational database. So, let's start with that. My question is: what is
the big advantage of this set-up in a case where the data are relatively
static?

Letâ€™s, for the sake of argument, presume I have a questionnaire, composed of
a series of questions, each allowing 1 answer, to be chosen among a group of
predefined answers.
The way to store the information for statistical analysis in e.g. Excel,
would be in one table, in which every line contains the answers to one
questionnaire and every column the answers to one question.
In Access, however, I understand that the correct way to store the same
information would be to have one table in which each line contains one
questionnaireâ€™s answer to one question. Now letâ€™s say (example 1) my
questionnaire has 60 questions (not unusual) and that I have 6000
questionnaires (also not an unusual case). That would mean that my table
would have 360,000 lines! That does not seem very practical to me.
Can you explain me what the big advantage is of storing the information this
way?

A slightly different situation occurs (example 2) when all of my 60
questions only allow for a â€œyesâ€ or a â€œnoâ€ answer. In that case I could put
into my table only the â€œyesâ€-answers, leaving out the â€œnoâ€-answers. If, on
average, every questionnaire had 10 â€œyesâ€-answers, that would reduce my table
to 60,000 lines, still a bit bulky.

Now letâ€™s consider (example 3) a questionnaire with 60 questions in which
only one (not more, not less) can be answered â€œyesâ€. In that case one line
per questionnaire would be enough, stating only which of the 60 questions was
answered affirmatively. My table would be reduced to 6000 lines.

But I could use this same idea in example 2. Instead of one line stating
which of the answers received an affirmative answers, I would several. Again,
if the average number of affirmative answers per questionnaire were 10, my
table would be back to 60,000 lines.

The data I work with are not really from questionnaires, but they are
comparable in that the structure is very unlikely to change much over time.
The amount of data, however, is going to grow quickly.
Why should I use a relational database structure and if I need to, which of
the solutions I tried to describe above is the best?

Thanks again for your willingness to help.

"Pat Hartman (MVP)" wrote:

What you are asking for is called a "repeating group" in Relational database
speak. In a relational database, once you have more than one of something,
you have "many" and that requires a separate table. It is poor practice to
"flatten" the many items into many columns. One reason is that columns are
fixed and rows are not. If you add a column to a table, you'll also need to
do other maintenance such as modify queries, forms, reports, and code.
However, adding a row doesn't require any application changes. There are
several ways to implement your request. If you can upgrade to A2007 the
functionality is built in. Otherwise you'll need to build it yourself.
1. Codeless way - create a subform to show the selected values. Use a combo
to allow the user to choose.
2. VBA way - use a multi-select listbox to manage the data on a form.
However, since the multi-select listbox is not a bound control, you will
need to do all the reading/writing behind the scenes. When the user leaves
the listbox, you will need to add newly selected rows and delete no longer
selected rows from the many-side table.

"Wim" wrote in message
...
I have (in an Access 2003 database) a series of fields of the yes/no data
type. I would like to transform them into fields that have the
names/topics
of the existing fields as their possible answer categories.
Example: Field 1: listen to radio yes/no, Field 2: watch tv yes/no, Field
3:
read newspaper yes/no - to be transformed into Newfield 1, 2 and 3, all
three
with options radio/tv/newspaper. A record containing yes - no - yes for
the
initial fields would read radio - newspaper - NULL on the new fields.
(In SPSS this is called a "multiple response set" as the set of fields can
contain the answer to a single questionnaire question that allows more
than
one response.)
If it is not possible to transform the existing fields, is there a way of
having my users fill out a form with simple checkboxes, but the answers
being
stored in the kind of fields as I explained above?
Thank you for any suggestion.

**Pat Hartman \(MVP\)** · February 23rd, 2007, 09:42 PM posted to microsoft.public.access.tablesdbdesign

"That would mean that my table would have 360,000 lines! That does not seem
very practical to me.
Can you explain me what the big advantage is of storing the information this
way?"
The advantage is that you can analyze it easily. To work with the flattened
structure, you'll need a query for each separate column to analyze the data
so the more columns you have, the more queries you'll need. Not to mention
the nightmare of adding additional columns for future questionnaires.

"Now let's consider (example 3) a questionnaire with 60 questions in which
only one (not more, not less) can be answered "yes". In that case one line
per questionnaire would be enough, stating only which of the 60 questions
was
answered affirmatively. My table would be reduced to 6000 lines."
If only one question is answered in the first scenario, you would have only
6000 rows. The extra advantage of the normal structure is only populated
rows need to exist.

There is no question in my mind that rows wins over columns. You will just
have to try building queries based on both structures to convince yourself.
Don't forget that relational databases do not support functions that work
"across" the columns way that Excel does. In a relational query, functions
operate on a domain which is defined by the number of rows selected in a
query/table. If you wanted to determine which of the 60 columns contained
the "yes", you'd need to interrogate the 60 columns specifically by name.
With the normalized structure, you just have criteria that returns all the
rows for a particular survey - one criteria verses 60 nested if statements.
Give me the rows!!!!

If your questionnaire is sparse (meaning that not all questions need to be
answered), the normalized structure could likely take less overall space
than the flat structure even though it takes considerably more rows. Access
is capable of handling millions of rows efficiently.

"Wim" wrote in message
...
Pat, thanks for trying to help me. I realise that I was actually asking
several questions in one. You were correct in pointing out that my
questions
were based on an example that does not comply with basic rules for the
design
of a relational database. So, let's start with that. My question is: what
is
the big advantage of this set-up in a case where the data are relatively
static?

Let's, for the sake of argument, presume I have a questionnaire, composed
of
a series of questions, each allowing 1 answer, to be chosen among a group
of
predefined answers.
The way to store the information for statistical analysis in e.g. Excel,
would be in one table, in which every line contains the answers to one
questionnaire and every column the answers to one question.
In Access, however, I understand that the correct way to store the same
information would be to have one table in which each line contains one
questionnaire's answer to one question. Now let's say (example 1) my
questionnaire has 60 questions (not unusual) and that I have 6000
questionnaires (also not an unusual case). That would mean that my table
would have 360,000 lines! That does not seem very practical to me.
Can you explain me what the big advantage is of storing the information
this
way?

A slightly different situation occurs (example 2) when all of my 60
questions only allow for a "yes" or a "no" answer. In that case I could
put
into my table only the "yes"-answers, leaving out the "no"-answers. If, on
average, every questionnaire had 10 "yes"-answers, that would reduce my
table
to 60,000 lines, still a bit bulky.

Now let's consider (example 3) a questionnaire with 60 questions in which
only one (not more, not less) can be answered "yes". In that case one line
per questionnaire would be enough, stating only which of the 60 questions
was
answered affirmatively. My table would be reduced to 6000 lines.

But I could use this same idea in example 2. Instead of one line stating
which of the answers received an affirmative answers, I would several.
Again,
if the average number of affirmative answers per questionnaire were 10, my
table would be back to 60,000 lines.

The data I work with are not really from questionnaires, but they are
comparable in that the structure is very unlikely to change much over
time.
The amount of data, however, is going to grow quickly.
Why should I use a relational database structure and if I need to, which
of
the solutions I tried to describe above is the best?

Thanks again for your willingness to help.

"Pat Hartman (MVP)" wrote:

What you are asking for is called a "repeating group" in Relational
database
speak. In a relational database, once you have more than one of
something,
you have "many" and that requires a separate table. It is poor practice
to
"flatten" the many items into many columns. One reason is that columns
are
fixed and rows are not. If you add a column to a table, you'll also need
to
do other maintenance such as modify queries, forms, reports, and code.
However, adding a row doesn't require any application changes. There are
several ways to implement your request. If you can upgrade to A2007 the
functionality is built in. Otherwise you'll need to build it yourself.
1. Codeless way - create a subform to show the selected values. Use a
combo
to allow the user to choose.
2. VBA way - use a multi-select listbox to manage the data on a form.
However, since the multi-select listbox is not a bound control, you will
need to do all the reading/writing behind the scenes. When the user
leaves
the listbox, you will need to add newly selected rows and delete no
longer
selected rows from the many-side table.

"Wim" wrote in message
...
I have (in an Access 2003 database) a series of fields of the yes/no
data
type. I would like to transform them into fields that have the
names/topics
of the existing fields as their possible answer categories.
Example: Field 1: listen to radio yes/no, Field 2: watch tv yes/no,
Field
3:
read newspaper yes/no - to be transformed into Newfield 1, 2 and 3, all
three
with options radio/tv/newspaper. A record containing yes - no - yes for
the
initial fields would read radio - newspaper - NULL on the new fields.
(In SPSS this is called a "multiple response set" as the set of fields
can
contain the answer to a single questionnaire question that allows more
than
one response.)
If it is not possible to transform the existing fields, is there a way
of
having my users fill out a form with simple checkboxes, but the answers
being
stored in the kind of fields as I explained above?
Thank you for any suggestion.

**tina** · February 24th, 2007, 09:08 AM posted to microsoft.public.access.tablesdbdesign

Wim, you might also benefit from studying the often-recommended sample
survey database created by MVP Duane Hookom, at
http://www.rogersaccesslibrary.com/O...p#Hookom,Duane.

hth

"Wim" wrote in message
...
Pat, thanks for trying to help me. I realise that I was actually asking
several questions in one. You were correct in pointing out that my
questions
were based on an example that does not comply with basic rules for the
design
of a relational database. So, let's start with that. My question is: what
is
the big advantage of this set-up in a case where the data are relatively
static?

Let's, for the sake of argument, presume I have a questionnaire, composed
of
a series of questions, each allowing 1 answer, to be chosen among a group
of
predefined answers.
The way to store the information for statistical analysis in e.g. Excel,
would be in one table, in which every line contains the answers to one
questionnaire and every column the answers to one question.
In Access, however, I understand that the correct way to store the same
information would be to have one table in which each line contains one
questionnaire's answer to one question. Now let's say (example 1) my
questionnaire has 60 questions (not unusual) and that I have 6000
questionnaires (also not an unusual case). That would mean that my table
would have 360,000 lines! That does not seem very practical to me.
Can you explain me what the big advantage is of storing the information
this
way?

A slightly different situation occurs (example 2) when all of my 60
questions only allow for a "yes" or a "no" answer. In that case I could
put
into my table only the "yes"-answers, leaving out the "no"-answers. If, on
average, every questionnaire had 10 "yes"-answers, that would reduce my
table
to 60,000 lines, still a bit bulky.

Now let's consider (example 3) a questionnaire with 60 questions in which
only one (not more, not less) can be answered "yes". In that case one line
per questionnaire would be enough, stating only which of the 60 questions
was
answered affirmatively. My table would be reduced to 6000 lines.

But I could use this same idea in example 2. Instead of one line stating
which of the answers received an affirmative answers, I would several.
Again,
if the average number of affirmative answers per questionnaire were 10, my
table would be back to 60,000 lines.

The data I work with are not really from questionnaires, but they are
comparable in that the structure is very unlikely to change much over
time.
The amount of data, however, is going to grow quickly.
Why should I use a relational database structure and if I need to, which
of
the solutions I tried to describe above is the best?

Thanks again for your willingness to help.

"Pat Hartman (MVP)" wrote:

What you are asking for is called a "repeating group" in Relational
database
speak. In a relational database, once you have more than one of
something,
you have "many" and that requires a separate table. It is poor practice
to
"flatten" the many items into many columns. One reason is that columns
are
fixed and rows are not. If you add a column to a table, you'll also
need to
do other maintenance such as modify queries, forms, reports, and code.
However, adding a row doesn't require any application changes. There
are
several ways to implement your request. If you can upgrade to A2007 the
functionality is built in. Otherwise you'll need to build it yourself.
1. Codeless way - create a subform to show the selected values. Use a
combo
to allow the user to choose.
2. VBA way - use a multi-select listbox to manage the data on a form.
However, since the multi-select listbox is not a bound control, you will
need to do all the reading/writing behind the scenes. When the user
leaves
the listbox, you will need to add newly selected rows and delete no
longer
selected rows from the many-side table.

"Wim" wrote in message
...
I have (in an Access 2003 database) a series of fields of the yes/no
data
type. I would like to transform them into fields that have the
names/topics
of the existing fields as their possible answer categories.
Example: Field 1: listen to radio yes/no, Field 2: watch tv yes/no,
Field
3:
read newspaper yes/no - to be transformed into Newfield 1, 2 and 3,
all
three
with options radio/tv/newspaper. A record containing yes - no - yes
for
the
initial fields would read radio - newspaper - NULL on the new fields.
(In SPSS this is called a "multiple response set" as the set of fields
can
contain the answer to a single questionnaire question that allows more
than
one response.)
If it is not possible to transform the existing fields, is there a way
of
having my users fill out a form with simple checkboxes, but the
answers
being
stored in the kind of fields as I explained above?
Thank you for any suggestion.

**Wim** · February 24th, 2007, 01:39 PM posted to microsoft.public.access.tablesdbdesign

Thanks again, Pat, for answering my question. But I have to say Iâ€™m not fully
convinced yet.

As for the first part of your answer, Iâ€™d say it just shows that for
analysing data, Access simply isnâ€™t a good choice. Fortunately itâ€™s easy to
exchange data between Access and Excel. But imagine I have two columns with
numerical data and I want to calculate the correlation between them; that
would be a lot easier in the â€œflattened structureâ€ than in what you call a
â€œnormalizedâ€ structure. Also, I do not understand why you consider adding
columns such a â€œnightmareâ€. It doesnâ€™t seem that complicated to me. And I
wouldnâ€™t have to do it very often.

But letâ€™s get to my second question. Suppose I have my questionnaire data
(several hundreds of records) in a flattened table, 60 columns long, all of
the yes/no data type, and I want to transform them into a normalized table.
Is there a way to do this, without having to type in everything for a second
time? I suppose this has to be done by means of a query, but I just canâ€™t
figure out how. Can you help?

"Pat Hartman (MVP)" wrote:

"That would mean that my table would have 360,000 lines! That does not seem
very practical to me.
Can you explain me what the big advantage is of storing the information this
way?"
The advantage is that you can analyze it easily. To work with the flattened
structure, you'll need a query for each separate column to analyze the data
so the more columns you have, the more queries you'll need. Not to mention
the nightmare of adding additional columns for future questionnaires.

"Now let's consider (example 3) a questionnaire with 60 questions in which
only one (not more, not less) can be answered "yes". In that case one line
per questionnaire would be enough, stating only which of the 60 questions
was
answered affirmatively. My table would be reduced to 6000 lines."
If only one question is answered in the first scenario, you would have only
6000 rows. The extra advantage of the normal structure is only populated
rows need to exist.

There is no question in my mind that rows wins over columns. You will just
have to try building queries based on both structures to convince yourself.
Don't forget that relational databases do not support functions that work
"across" the columns way that Excel does. In a relational query, functions
operate on a domain which is defined by the number of rows selected in a
query/table. If you wanted to determine which of the 60 columns contained
the "yes", you'd need to interrogate the 60 columns specifically by name.
With the normalized structure, you just have criteria that returns all the
rows for a particular survey - one criteria verses 60 nested if statements.
Give me the rows!!!!

If your questionnaire is sparse (meaning that not all questions need to be
answered), the normalized structure could likely take less overall space
than the flat structure even though it takes considerably more rows. Access
is capable of handling millions of rows efficiently.

"Wim" wrote in message
...
Pat, thanks for trying to help me. I realise that I was actually asking
several questions in one. You were correct in pointing out that my
questions
were based on an example that does not comply with basic rules for the
design
of a relational database. So, let's start with that. My question is: what
is
the big advantage of this set-up in a case where the data are relatively
static?

Let's, for the sake of argument, presume I have a questionnaire, composed
of
a series of questions, each allowing 1 answer, to be chosen among a group
of
predefined answers.
The way to store the information for statistical analysis in e.g. Excel,
would be in one table, in which every line contains the answers to one
questionnaire and every column the answers to one question.
In Access, however, I understand that the correct way to store the same
information would be to have one table in which each line contains one
questionnaire's answer to one question. Now let's say (example 1) my
questionnaire has 60 questions (not unusual) and that I have 6000
questionnaires (also not an unusual case). That would mean that my table
would have 360,000 lines! That does not seem very practical to me.
Can you explain me what the big advantage is of storing the information
this
way?

A slightly different situation occurs (example 2) when all of my 60
questions only allow for a "yes" or a "no" answer. In that case I could
put
into my table only the "yes"-answers, leaving out the "no"-answers. If, on
average, every questionnaire had 10 "yes"-answers, that would reduce my
table
to 60,000 lines, still a bit bulky.

Now let's consider (example 3) a questionnaire with 60 questions in which
only one (not more, not less) can be answered "yes". In that case one line
per questionnaire would be enough, stating only which of the 60 questions
was
answered affirmatively. My table would be reduced to 6000 lines.

But I could use this same idea in example 2. Instead of one line stating
which of the answers received an affirmative answers, I would several.
Again,
if the average number of affirmative answers per questionnaire were 10, my
table would be back to 60,000 lines.

The data I work with are not really from questionnaires, but they are
comparable in that the structure is very unlikely to change much over
time.
The amount of data, however, is going to grow quickly.
Why should I use a relational database structure and if I need to, which
of
the solutions I tried to describe above is the best?

Thanks again for your willingness to help.

"Pat Hartman (MVP)" wrote:

What you are asking for is called a "repeating group" in Relational
database
speak. In a relational database, once you have more than one of
something,
you have "many" and that requires a separate table. It is poor practice
to
"flatten" the many items into many columns. One reason is that columns
are
fixed and rows are not. If you add a column to a table, you'll also need
to
do other maintenance such as modify queries, forms, reports, and code.
However, adding a row doesn't require any application changes. There are
several ways to implement your request. If you can upgrade to A2007 the
functionality is built in. Otherwise you'll need to build it yourself.
1. Codeless way - create a subform to show the selected values. Use a
combo
to allow the user to choose.
2. VBA way - use a multi-select listbox to manage the data on a form.
However, since the multi-select listbox is not a bound control, you will
need to do all the reading/writing behind the scenes. When the user
leaves
the listbox, you will need to add newly selected rows and delete no
longer
selected rows from the many-side table.

"Wim" wrote in message
...
I have (in an Access 2003 database) a series of fields of the yes/no
data
type. I would like to transform them into fields that have the
names/topics
of the existing fields as their possible answer categories.
Example: Field 1: listen to radio yes/no, Field 2: watch tv yes/no,
Field
3:
read newspaper yes/no - to be transformed into Newfield 1, 2 and 3, all
three
with options radio/tv/newspaper. A record containing yes - no - yes for
the
initial fields would read radio - newspaper - NULL on the new fields.
(In SPSS this is called a "multiple response set" as the set of fields
can
contain the answer to a single questionnaire question that allows more
than
one response.)
If it is not possible to transform the existing fields, is there a way
of
having my users fill out a form with simple checkboxes, but the answers
being
stored in the kind of fields as I explained above?
Thank you for any suggestion.

**Wim** · February 24th, 2007, 02:48 PM posted to microsoft.public.access.tablesdbdesign

Thanks for the tip, tina, I will have a good look at it.

"tina" wrote:

Wim, you might also benefit from studying the often-recommended sample
survey database created by MVP Duane Hookom, at
http://www.rogersaccesslibrary.com/O...p#Hookom,Duane.

hth

"Wim" wrote in message
...
Pat, thanks for trying to help me. I realise that I was actually asking
several questions in one. You were correct in pointing out that my
questions
were based on an example that does not comply with basic rules for the
design
of a relational database. So, let's start with that. My question is: what
is
the big advantage of this set-up in a case where the data are relatively
static?

Let's, for the sake of argument, presume I have a questionnaire, composed
of
a series of questions, each allowing 1 answer, to be chosen among a group
of
predefined answers.
The way to store the information for statistical analysis in e.g. Excel,
would be in one table, in which every line contains the answers to one
questionnaire and every column the answers to one question.
In Access, however, I understand that the correct way to store the same
information would be to have one table in which each line contains one
questionnaire's answer to one question. Now let's say (example 1) my
questionnaire has 60 questions (not unusual) and that I have 6000
questionnaires (also not an unusual case). That would mean that my table
would have 360,000 lines! That does not seem very practical to me.
Can you explain me what the big advantage is of storing the information
this
way?

A slightly different situation occurs (example 2) when all of my 60
questions only allow for a "yes" or a "no" answer. In that case I could
put
into my table only the "yes"-answers, leaving out the "no"-answers. If, on
average, every questionnaire had 10 "yes"-answers, that would reduce my
table
to 60,000 lines, still a bit bulky.

Now let's consider (example 3) a questionnaire with 60 questions in which
only one (not more, not less) can be answered "yes". In that case one line
per questionnaire would be enough, stating only which of the 60 questions
was
answered affirmatively. My table would be reduced to 6000 lines.

But I could use this same idea in example 2. Instead of one line stating
which of the answers received an affirmative answers, I would several.
Again,
if the average number of affirmative answers per questionnaire were 10, my
table would be back to 60,000 lines.

The data I work with are not really from questionnaires, but they are
comparable in that the structure is very unlikely to change much over
time.
The amount of data, however, is going to grow quickly.
Why should I use a relational database structure and if I need to, which
of
the solutions I tried to describe above is the best?

Thanks again for your willingness to help.

"Pat Hartman (MVP)" wrote:

What you are asking for is called a "repeating group" in Relational
database
speak. In a relational database, once you have more than one of
something,
you have "many" and that requires a separate table. It is poor practice
to
"flatten" the many items into many columns. One reason is that columns
are
fixed and rows are not. If you add a column to a table, you'll also
need to
do other maintenance such as modify queries, forms, reports, and code.
However, adding a row doesn't require any application changes. There
are
several ways to implement your request. If you can upgrade to A2007 the
functionality is built in. Otherwise you'll need to build it yourself.
1. Codeless way - create a subform to show the selected values. Use a
combo
to allow the user to choose.
2. VBA way - use a multi-select listbox to manage the data on a form.
However, since the multi-select listbox is not a bound control, you will
need to do all the reading/writing behind the scenes. When the user
leaves
the listbox, you will need to add newly selected rows and delete no
longer
selected rows from the many-side table.

"Wim" wrote in message
...
I have (in an Access 2003 database) a series of fields of the yes/no
data
type. I would like to transform them into fields that have the
names/topics
of the existing fields as their possible answer categories.
Example: Field 1: listen to radio yes/no, Field 2: watch tv yes/no,
Field
3:
read newspaper yes/no - to be transformed into Newfield 1, 2 and 3,
all
three
with options radio/tv/newspaper. A record containing yes - no - yes
for
the
initial fields would read radio - newspaper - NULL on the new fields.
(In SPSS this is called a "multiple response set" as the set of fields
can
contain the answer to a single questionnaire question that allows more
than
one response.)
If it is not possible to transform the existing fields, is there a way
of
having my users fill out a form with simple checkboxes, but the
answers
being
stored in the kind of fields as I explained above?
Thank you for any suggestion.

**tina** · February 24th, 2007, 08:31 PM posted to microsoft.public.access.tablesdbdesign

actually, Access is an excellent and very powerful tool for data analysis.
but you can't expect to use it the same way that you use Excel. Excel is
specifically a flatfile data format, and Access is specifically a relational
data format. the tools available in each software program are optimized to
work with the data in its' particular format. if you prefer to work with a
flatfile format, you'd be better off to stick with Excel.

if you want to use the tools in Access to analyze your data, you need to
embrace the concept of a relational structure for the data. otherwise,
you're essentially wasting your time because you cannot fully, or easily,
leverage the power of Access when the data is stored in a flatfile
structure, period.

to answer your question about migrating data into a normalized table
structu i'm sure it's do-able, but i don't think i can offer any
constructive suggestions until we share a common basic "relational"
language. suggest you read up on the basics of data normalization, so the
suggestions you get from me - or anyone else here - will make sense to you.
for more information, see
http://home.bendbroadband.com/conrad...abaseDesign101

hth

"Wim" wrote in message
...
Thanks again, Pat, for answering my question. But I have to say I'm not
fully
convinced yet.

As for the first part of your answer, I'd say it just shows that for
analysing data, Access simply isn't a good choice. Fortunately it's easy
to
exchange data between Access and Excel. But imagine I have two columns
with
numerical data and I want to calculate the correlation between them; that
would be a lot easier in the "flattened structure" than in what you call a
"normalized" structure. Also, I do not understand why you consider adding
columns such a "nightmare". It doesn't seem that complicated to me. And I
wouldn't have to do it very often.

But let's get to my second question. Suppose I have my questionnaire data
(several hundreds of records) in a flattened table, 60 columns long, all
of
the yes/no data type, and I want to transform them into a normalized
table.
Is there a way to do this, without having to type in everything for a
second
time? I suppose this has to be done by means of a query, but I just can't
figure out how. Can you help?

"Pat Hartman (MVP)" wrote:

"That would mean that my table would have 360,000 lines! That does not
seem
very practical to me.
Can you explain me what the big advantage is of storing the information
this
way?"
The advantage is that you can analyze it easily. To work with the
flattened
structure, you'll need a query for each separate column to analyze the
data
so the more columns you have, the more queries you'll need. Not to
mention
the nightmare of adding additional columns for future questionnaires.

"Now let's consider (example 3) a questionnaire with 60 questions in
which
only one (not more, not less) can be answered "yes". In that case one
line
per questionnaire would be enough, stating only which of the 60
questions
was
answered affirmatively. My table would be reduced to 6000 lines."
If only one question is answered in the first scenario, you would have
only
6000 rows. The extra advantage of the normal structure is only
populated
rows need to exist.

There is no question in my mind that rows wins over columns. You will
just
have to try building queries based on both structures to convince
yourself.
Don't forget that relational databases do not support functions that
work
"across" the columns way that Excel does. In a relational query,
functions
operate on a domain which is defined by the number of rows selected in a
query/table. If you wanted to determine which of the 60 columns
contained
the "yes", you'd need to interrogate the 60 columns specifically by
name.
With the normalized structure, you just have criteria that returns all
the
rows for a particular survey - one criteria verses 60 nested if
statements.
Give me the rows!!!!

If your questionnaire is sparse (meaning that not all questions need to
be
answered), the normalized structure could likely take less overall space
than the flat structure even though it takes considerably more rows.
Access
is capable of handling millions of rows efficiently.

"Wim" wrote in message
...
Pat, thanks for trying to help me. I realise that I was actually
asking
several questions in one. You were correct in pointing out that my
questions
were based on an example that does not comply with basic rules for the
design
of a relational database. So, let's start with that. My question is:
what
is
the big advantage of this set-up in a case where the data are
relatively
static?

Let's, for the sake of argument, presume I have a questionnaire,
composed
of
a series of questions, each allowing 1 answer, to be chosen among a
group
of
predefined answers.
The way to store the information for statistical analysis in e.g.
Excel,
would be in one table, in which every line contains the answers to one
questionnaire and every column the answers to one question.
In Access, however, I understand that the correct way to store the
same
information would be to have one table in which each line contains one
questionnaire's answer to one question. Now let's say (example 1) my
questionnaire has 60 questions (not unusual) and that I have 6000
questionnaires (also not an unusual case). That would mean that my
table
would have 360,000 lines! That does not seem very practical to me.
Can you explain me what the big advantage is of storing the
information
this
way?

A slightly different situation occurs (example 2) when all of my 60
questions only allow for a "yes" or a "no" answer. In that case I
could
put
into my table only the "yes"-answers, leaving out the "no"-answers.
If, on
average, every questionnaire had 10 "yes"-answers, that would reduce
my
table
to 60,000 lines, still a bit bulky.

Now let's consider (example 3) a questionnaire with 60 questions in
which
only one (not more, not less) can be answered "yes". In that case one
line
per questionnaire would be enough, stating only which of the 60
questions
was
answered affirmatively. My table would be reduced to 6000 lines.

But I could use this same idea in example 2. Instead of one line
stating
which of the answers received an affirmative answers, I would several.
Again,
if the average number of affirmative answers per questionnaire were
10, my
table would be back to 60,000 lines.

The data I work with are not really from questionnaires, but they are
comparable in that the structure is very unlikely to change much over
time.
The amount of data, however, is going to grow quickly.
Why should I use a relational database structure and if I need to,
which
of
the solutions I tried to describe above is the best?

Thanks again for your willingness to help.

"Pat Hartman (MVP)" wrote:

What you are asking for is called a "repeating group" in Relational
database
speak. In a relational database, once you have more than one of
something,
you have "many" and that requires a separate table. It is poor
practice
to
"flatten" the many items into many columns. One reason is that
columns
are
fixed and rows are not. If you add a column to a table, you'll also
need
to
do other maintenance such as modify queries, forms, reports, and
code.
However, adding a row doesn't require any application changes. There
are
several ways to implement your request. If you can upgrade to A2007
the
functionality is built in. Otherwise you'll need to build it
yourself.
1. Codeless way - create a subform to show the selected values. Use
a
combo
to allow the user to choose.
2. VBA way - use a multi-select listbox to manage the data on a form.
However, since the multi-select listbox is not a bound control, you
will
need to do all the reading/writing behind the scenes. When the user
leaves
the listbox, you will need to add newly selected rows and delete no
longer
selected rows from the many-side table.

"Wim" wrote in message
...
I have (in an Access 2003 database) a series of fields of the yes/no
data
type. I would like to transform them into fields that have the
names/topics
of the existing fields as their possible answer categories.
Example: Field 1: listen to radio yes/no, Field 2: watch tv yes/no,
Field
3:
read newspaper yes/no - to be transformed into Newfield 1, 2 and 3,
all
three
with options radio/tv/newspaper. A record containing yes - no - yes
for
the
initial fields would read radio - newspaper - NULL on the new
fields.
(In SPSS this is called a "multiple response set" as the set of
fields
can
contain the answer to a single questionnaire question that allows
more
than
one response.)
If it is not possible to transform the existing fields, is there a
way
of
having my users fill out a form with simple checkboxes, but the
answers
being
stored in the kind of fields as I explained above?
Thank you for any suggestion.

**Wim** · February 24th, 2007, 10:22 PM posted to microsoft.public.access.tablesdbdesign

Tina, thanks for your reply.
I have actually read quite a lot about the theory of relational databases,
entity-relationship modeling and the theory of normalisation. What I do not
have is much experience.
Also, I am not defending Excel or flatfile stucture as opposed to Access or
relational structure. I am sorry if I gave you that impression. For part of
the data I am working with, the relational structure seems to be the best
option. My doubts are about part of my data that have a structure comparable
to survey data.
So, please be so kind to tell me what suggestions you have for me...
Thanks.

"tina" wrote:

actually, Access is an excellent and very powerful tool for data analysis.
but you can't expect to use it the same way that you use Excel. Excel is
specifically a flatfile data format, and Access is specifically a relational
data format. the tools available in each software program are optimized to
work with the data in its' particular format. if you prefer to work with a
flatfile format, you'd be better off to stick with Excel.

if you want to use the tools in Access to analyze your data, you need to
embrace the concept of a relational structure for the data. otherwise,
you're essentially wasting your time because you cannot fully, or easily,
leverage the power of Access when the data is stored in a flatfile
structure, period.

to answer your question about migrating data into a normalized table
structu i'm sure it's do-able, but i don't think i can offer any
constructive suggestions until we share a common basic "relational"
language. suggest you read up on the basics of data normalization, so the
suggestions you get from me - or anyone else here - will make sense to you.
for more information, see
http://home.bendbroadband.com/conrad...abaseDesign101

hth

"Wim" wrote in message
...
Thanks again, Pat, for answering my question. But I have to say I'm not
fully
convinced yet.

As for the first part of your answer, I'd say it just shows that for
analysing data, Access simply isn't a good choice. Fortunately it's easy
to
exchange data between Access and Excel. But imagine I have two columns
with
numerical data and I want to calculate the correlation between them; that
would be a lot easier in the "flattened structure" than in what you call a
"normalized" structure. Also, I do not understand why you consider adding
columns such a "nightmare". It doesn't seem that complicated to me. And I
wouldn't have to do it very often.

But let's get to my second question. Suppose I have my questionnaire data
(several hundreds of records) in a flattened table, 60 columns long, all
of
the yes/no data type, and I want to transform them into a normalized
table.
Is there a way to do this, without having to type in everything for a
second
time? I suppose this has to be done by means of a query, but I just can't
figure out how. Can you help?

"Pat Hartman (MVP)" wrote:

"That would mean that my table would have 360,000 lines! That does not
seem
very practical to me.
Can you explain me what the big advantage is of storing the information
this
way?"
The advantage is that you can analyze it easily. To work with the
flattened
structure, you'll need a query for each separate column to analyze the
data
so the more columns you have, the more queries you'll need. Not to
mention
the nightmare of adding additional columns for future questionnaires.

"Now let's consider (example 3) a questionnaire with 60 questions in
which
only one (not more, not less) can be answered "yes". In that case one
line
per questionnaire would be enough, stating only which of the 60
questions
was
answered affirmatively. My table would be reduced to 6000 lines."
If only one question is answered in the first scenario, you would have
only
6000 rows. The extra advantage of the normal structure is only
populated
rows need to exist.

There is no question in my mind that rows wins over columns. You will
just
have to try building queries based on both structures to convince
yourself.
Don't forget that relational databases do not support functions that
work
"across" the columns way that Excel does. In a relational query,
functions
operate on a domain which is defined by the number of rows selected in a
query/table. If you wanted to determine which of the 60 columns
contained
the "yes", you'd need to interrogate the 60 columns specifically by
name.
With the normalized structure, you just have criteria that returns all
the
rows for a particular survey - one criteria verses 60 nested if
statements.
Give me the rows!!!!

If your questionnaire is sparse (meaning that not all questions need to
be
answered), the normalized structure could likely take less overall space
than the flat structure even though it takes considerably more rows.
Access
is capable of handling millions of rows efficiently.

"Wim" wrote in message
...
Pat, thanks for trying to help me. I realise that I was actually
asking
several questions in one. You were correct in pointing out that my
questions
were based on an example that does not comply with basic rules for the
design
of a relational database. So, let's start with that. My question is:
what
is
the big advantage of this set-up in a case where the data are
relatively
static?

Let's, for the sake of argument, presume I have a questionnaire,
composed
of
a series of questions, each allowing 1 answer, to be chosen among a
group
of
predefined answers.
The way to store the information for statistical analysis in e.g.
Excel,
would be in one table, in which every line contains the answers to one
questionnaire and every column the answers to one question.
In Access, however, I understand that the correct way to store the
same
information would be to have one table in which each line contains one
questionnaire's answer to one question. Now let's say (example 1) my
questionnaire has 60 questions (not unusual) and that I have 6000
questionnaires (also not an unusual case). That would mean that my
table
would have 360,000 lines! That does not seem very practical to me.
Can you explain me what the big advantage is of storing the
information
this
way?

A slightly different situation occurs (example 2) when all of my 60
questions only allow for a "yes" or a "no" answer. In that case I
could
put
into my table only the "yes"-answers, leaving out the "no"-answers.
If, on
average, every questionnaire had 10 "yes"-answers, that would reduce
my
table
to 60,000 lines, still a bit bulky.

Now let's consider (example 3) a questionnaire with 60 questions in
which
only one (not more, not less) can be answered "yes". In that case one
line
per questionnaire would be enough, stating only which of the 60
questions
was
answered affirmatively. My table would be reduced to 6000 lines.

But I could use this same idea in example 2. Instead of one line
stating
which of the answers received an affirmative answers, I would several.
Again,
if the average number of affirmative answers per questionnaire were
10, my
table would be back to 60,000 lines.

The data I work with are not really from questionnaires, but they are
comparable in that the structure is very unlikely to change much over
time.
The amount of data, however, is going to grow quickly.
Why should I use a relational database structure and if I need to,
which
of
the solutions I tried to describe above is the best?

Thanks again for your willingness to help.

"Pat Hartman (MVP)" wrote:

What you are asking for is called a "repeating group" in Relational
database
speak. In a relational database, once you have more than one of
something,
you have "many" and that requires a separate table. It is poor
practice
to
"flatten" the many items into many columns. One reason is that
columns
are
fixed and rows are not. If you add a column to a table, you'll also
need
to
do other maintenance such as modify queries, forms, reports, and
code.
However, adding a row doesn't require any application changes. There
are
several ways to implement your request. If you can upgrade to A2007
the
functionality is built in. Otherwise you'll need to build it
yourself.
1. Codeless way - create a subform to show the selected values. Use
a
combo
to allow the user to choose.
2. VBA way - use a multi-select listbox to manage the data on a form.
However, since the multi-select listbox is not a bound control, you
will
need to do all the reading/writing behind the scenes. When the user
leaves
the listbox, you will need to add newly selected rows and delete no
longer
selected rows from the many-side table.

"Wim" wrote in message
...
I have (in an Access 2003 database) a series of fields of the yes/no
data
type. I would like to transform them into fields that have the
names/topics
of the existing fields as their possible answer categories.
Example: Field 1: listen to radio yes/no, Field 2: watch tv yes/no,
Field
3:
read newspaper yes/no - to be transformed into Newfield 1, 2 and 3,
all
three
with options radio/tv/newspaper. A record containing yes - no - yes
for
the
initial fields would read radio - newspaper - NULL on the new
fields.
(In SPSS this is called a "multiple response set" as the set of
fields
can
contain the answer to a single questionnaire question that allows
more
than
one response.)
If it is not possible to transform the existing fields, is there a
way
of
having my users fill out a form with simple checkboxes, but the
answers
being
stored in the kind of fields as I explained above?
Thank you for any suggestion.

**tina** · February 25th, 2007, 07:01 PM posted to microsoft.public.access.tablesdbdesign

okay, let's see if i can offer something helpful, here. i'll base my
suggestions on the following, taken from a previous post in this thread:

Suppose I have my questionnaire data
(several hundreds of records) in a flattened table, 60 columns long, all
of
the yes/no data type, and I want to transform them into a normalized
table.
Is there a way to do this, without having to type in everything for a
second
time? I suppose this has to be done by means of a query, but I just can't
figure out how. Can you help?

basically, you need a table of questions, one record for each question. if
you have 60 different questions, one for each column in the spreadsheet,
then the table will hold 60 records, one question per record. Sixty is not
so many, i'd just enter the questions manually - it'll probably be quicker
than trying to migrate that data programmatically.

you need a table of answer *options*. if all 60 questions have the same
three answer options, then this table will only have three records. if some
questions will have different options than others, then you have a
many-to-many relationship between questions and options: one question may
have many answer options, and one option may be available for many
questions. in this case, your options table should list all possible
options, one record for each option. then you'll need a third table to link
options to questions - each record will store one question/answer option
combination. so a question with three answer options will have three records
in this table; a question with five answer options will have five records in
this table, etc.

then you need a table to store the actual options chosen for each question,
by each survey respondent. you'll also need a table to store data that is
*survey-specific*, rather than answer-specific, because you don't want
repeating data stored with each answer on a particular completed survey. so
your answers table will include a foreign key pointing back to a particular
completed survey, and a foreign key pointing back to a particular question
option (that linking table described in the paragraph above). note that you
do *not* need a foreign key pointing to a record in the questions table,
because that key value is already stored in the linking table.

again, you really should study Duane Hookom's sample survey database to see
how this is all laid out and connected.

okay, that's basically the tables you'll need. migrating the data from a
spreadsheet format will be ugly, no doubt about it - i wouldn't be surprised
if it takes an entire day, or two, to do it. once you have the questions
table, options table, and question-options table set up, and the spreadsheet
data imported into its' own table, you should stop and back up your
database. that way if you mess up and have to start over, you have a clean
"original" to copy and begin again.

add a primary key field to the spreadsheet table (an autonumber field will
do fine). now each survey record is uniquely identified. use an Append query
to copy the *survey-specific* data into the surveys table, making sure that
you also append the primary key field.

you'll have to migrate each answer column in the spreadsheet table
*separately*. you're appending answer data from the spreadsheet table into
the answers table; make sure you include the primary key field in the
append. set criteria on the spreadsheet column you're working with: "yes"
if it's a Text data type, True if it's a Yes/No (True/False) data type.
manually enter the primary key value of the appropriate answer option from
the answer options table, to be appended to the answer field in the answers
table. the end result will be that, for the spreadsheet answer column you
working with, all surveys having a Yes answer to that question will be
migrated into the answers table using the appropriate key value from the
answer options table. for instance, out of the 6000 records in the
spreadsheet table, if 3010 answered yes in the first answer column, then the
query will append 3010 records into the answers table.

once you've migrated the first column of answer data from the spreadsheet
table into the answers table, stop and check it for accuracy. create a
Select query, linking the surveys table, the answers table, and the question
options table, and the questions table. pull the survey fields, and the
question option fields, and the questions fields into the grid, and view it
to make sure you're seeing the correct answer for the correct question, for
the appropriate number of survey records.

if you're only doing this migration once, i wouldn't bother writing a
separate Append query for each column and saving the query, etc. just write
an Append query to handle migrating the first answer column, run it, then
modify it to handle the second answer column, run it, etc.

this is hard to explain in the abstract, so i hope you at least got an idea
of how to do it, if not all the (fuzzy) details.

hth

Thread Tools
Show Printable Version Email this Page
Display Modes
Linear Mode Switch to Hybrid Mode Switch to Threaded Mode