"Normalizing" imported data



 
 
Thread Tools Display Modes
  #1  
Old December 14th, 2009, 02:32 PM posted to microsoft.public.access.tablesdbdesign
pccdavef

Hello all -

I have a situation where I'm importing substantial amounts of data (30,000 to
120,000 rows at a pop) from three external sources, via a text file or an
Excel spreadsheet, into three different data tables. I've established lookup
tables for those fields that can reasonably be normalized across the data
tables.

The process I'm going through is:

1. Import the raw data into an "import" table that matches the structure of
the source data. Also included in the import table are columns for foreign
keys of 'normalizable' fields, which are set to 0 when the source data is
imported.

2. Append any new lookup data that may be present in the source file to the
lookup tables.

3. Run a series of update queries on the import table to update the foreign
key fields with the keys of the lookup data. Depending on the source data
file, there are between 3 and 7 of these update queries (steps 2 and 3 are
sketched in code after this list).

4. Append new records into the data table using only the foreign key values
where applicable.
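
To make that concrete, steps 2 and 3 for one of the lookup fields boil down
to something like the VBA below. The names (tblImport, tluVendor, VendorName,
VendorID) are stand-ins for my real tables and fields:

' Rough sketch of steps 2 and 3 for a single lookup field.
' tblImport, tluVendor, VendorName and VendorID are stand-in names.
Public Sub NormalizeVendorField()
    Dim db As DAO.Database
    Set db = CurrentDb

    ' Step 2: append vendor names that aren't in the lookup table yet.
    db.Execute _
        "INSERT INTO tluVendor (VendorName) " & _
        "SELECT DISTINCT i.VendorName " & _
        "FROM tblImport AS i LEFT JOIN tluVendor AS v " & _
        "ON i.VendorName = v.VendorName " & _
        "WHERE v.VendorID IS NULL;", dbFailOnError

    ' Step 3: stamp the matching foreign key onto every import row.
    db.Execute _
        "UPDATE tblImport INNER JOIN tluVendor " & _
        "ON tblImport.VendorName = tluVendor.VendorName " & _
        "SET tblImport.VendorID = tluVendor.VendorID;", dbFailOnError
End Sub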

I'm discovering that the update queries in step 3 are taking a LONG time to
run (several minutes each), which is going to annoy my users to no end.

My questions are:
- are there other, better processes or data structures to use?
- is there a way of optimizing update queries?

I appreciate any help or suggestions -
Dave

  #2  
Old December 14th, 2009, 02:49 PM posted to microsoft.public.access.tablesdbdesign
Jeff Boyce
external usenet poster
 
Posts: 1,555
Default "Normalizing" imported data

JOPO (just one person's opinion)...

Adding rows of data can be slow if your tables are indexed, as Access needs
to check existing rows to enforce constraints and maintain the indexes.

On the other hand, removing the index could mean loading bogus data.

Would your users be willing to trade a longer load time for guarantees that
they have good data?
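
If the indexes do turn out to be the bottleneck, one middle ground is to drop
the non-critical indexes before the bulk load and rebuild them afterwards,
while keeping any unique index that enforces integrity. A sketch, with an
invented index name:

' Sketch only: the index and table names (idxVendorID, tblImport) are
' invented. Keep any unique/primary key index in place so bad rows still
' get rejected during the load.
Public Sub BulkLoadWithoutIndexes()
    Dim db As DAO.Database
    Set db = CurrentDb
    db.Execute "DROP INDEX idxVendorID ON tblImport;", dbFailOnError
    ' ... run the bulk append queries here ...
    db.Execute "CREATE INDEX idxVendorID ON tblImport (VendorID);", dbFailOnError
End Sub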

--

Regards

Jeff Boyce
Microsoft Access MVP




"pccdavef" u56849@uwe wrote in message news:a093c0a1fd378@uwe...
Hello all -

I have a situation where I'm importing substantial amounts of data (30,000
to
120,000 rows at a pop) from three external sources through a text file or
an
Excel spreadsheet into three different data tables. I've established
lookup
tables for those fields that can reasonably be normalized between and
amongst
the data tables.

The process I'm going through is:

1. Import the raw data into an "import" table that matches the structure
of
the source data. Also included in the import table are columns for foreign
keys of 'normalizable' fields, which are set to 0 when the source data is
imported.

2. Append any new lookup data that may be present in the source file to
the
lookup tables.

3. Run a series of update queries on the import table to update the
foreign
key fields with the keys of the lookup data. Depending on the source data
file, there are between 3 and 7 of these update queries.

4. Append new records into the data table using only the foreign key
values
where applicable.

I'm discovering that the update queries in step 3 are taking a LONG time
to
run (several minutes each), which is going to annoy my users to no end.

My questions a
- are there other, better processes or data structures to use?
- is there a way of optimizing update queries?

I appreciate any help or suggestions -
Dave



  #3  
Old December 14th, 2009, 03:08 PM posted to microsoft.public.access.tablesdbdesign
Stefan Hoffmann

hi Dave,

On 14.12.2009 14:32, pccdavef wrote:
3. Run a series of update queries on the import table to update the foreign
key fields with the keys of the lookup data. Depending on the source data
file, there are between 3 and 7 of these update queries.

I'm discovering that the update queries in step 3 are taking a LONG time to
run (several minutes each), which is going to annoy my users to no end.

My questions are:
- are there other, better processes or data structures to use?
- is there a way of optimizing update queries?

I have to assume that you haven't indexed these fields in the import
table? So you're running up to seven update queries against the whole table.

I simply would not update the import table at all. Only insert the new records:

INSERT INTO [table] (fieldlist)
SELECT fieldsFromImport, fieldsFromLookup
FROM [importTable]
INNER JOIN [lookupTable] ON condition

Use one INSERT with all the necessary JOINs. This should at least reduce the
number of passes over the import table from seven to one.
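
With two lookup fields, for example, it could look like this. All names are
invented, and note that Jet SQL needs the nested parentheses when you chain
more than one JOIN:

' Sketch with invented names (tblData, tblImport, tluVendor, tluRegion).
' Jet requires the parentheses around each join pair when chaining joins.
Public Sub AppendImportInOnePass()
    Dim db As DAO.Database
    Set db = CurrentDb
    db.Execute _
        "INSERT INTO tblData (OrderDate, Amount, VendorID, RegionID) " & _
        "SELECT i.OrderDate, i.Amount, v.VendorID, r.RegionID " & _
        "FROM (tblImport AS i INNER JOIN tluVendor AS v " & _
        "ON i.VendorName = v.VendorName) " & _
        "INNER JOIN tluRegion AS r ON i.RegionName = r.RegionName;", _
        dbFailOnError
End Sub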

mfG
-- stefan --
  #4  
Old December 14th, 2009, 03:46 PM posted to microsoft.public.access.tablesdbdesign
pccdavef

Thanks Stefan -

I like that idea; I'll have to play with it some, but I think you're right -
it's much more efficient. Just a couple of hellacious queries.

Thanks!
Dave

Stefan Hoffmann wrote:
[quoted text clipped]


  #5  
Old December 15th, 2009, 09:01 AM posted to microsoft.public.access.tablesdbdesign
rolaaus[_2_]

I'm not sure how much this would help, or how applicable it would be to
your version of Access, but how about converting your query to VBA code?

"pccdavef" wrote:

Hello all -

I have a situation where I'm importing substantial amounts of data (30,000 to
120,000 rows at a pop) from three external sources through a text file or an
Excel spreadsheet into three different data tables. I've established lookup
tables for those fields that can reasonably be normalized between and amongst
the data tables.

The process I'm going through is:

1. Import the raw data into an "import" table that matches the structure of
the source data. Also included in the import table are columns for foreign
keys of 'normalizable' fields, which are set to 0 when the source data is
imported.

2. Append any new lookup data that may be present in the source file to the
lookup tables.

3. Run a series of update queries on the import table to update the foreign
key fields with the keys of the lookup data. Depending on the source data
file, there are between 3 and 7 of these update queries.

4. Append new records into the data table using only the foreign key values
where applicable.

I'm discovering that the update queries in step 3 are taking a LONG time to
run (several minutes each), which is going to annoy my users to no end.

My questions a
- are there other, better processes or data structures to use?
- is there a way of optimizing update queries?

I appreciate any help or suggestions -
Dave

.

  #6  
Old December 15th, 2009, 02:43 PM posted to microsoft.public.access.tablesdbdesign
Jeff Boyce

In your experience, does your VBA code run faster (I'm assuming a
record-by-record iteration through a recordset) than the set-level
operations that Access queries are designed to provide?

(I'm always looking for counter-examples and/or confirmation...)
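
For reference, the record-by-record version I'm picturing would be something
like this DAO loop (all names invented):

' The row-at-a-time alternative, with invented names. Every row pays for
' its own DLookup plus an Edit/Update round trip, so on 30,000-120,000
' rows I'd expect this to lose badly to a single set-based UPDATE.
Public Sub UpdateKeysRowByRow()
    Dim db As DAO.Database
    Dim rs As DAO.Recordset
    Set db = CurrentDb
    Set rs = db.OpenRecordset("tblImport", dbOpenDynaset)
    Do While Not rs.EOF
        rs.Edit
        rs!VendorID = Nz(DLookup("VendorID", "tluVendor", _
            "VendorName = '" & Replace(Nz(rs!VendorName, ""), "'", "''") & "'"), 0)
        rs.Update
        rs.MoveNext
    Loop
    rs.Close
End Sub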

--

Regards

Jeff Boyce
Microsoft Access MVP


"rolaaus" wrote in message
...
I'm not sure how accurate this would be, or how applicable it would be to
your version of Access, but how about converting your query to VBA code?

"pccdavef" wrote:

Hello all -

I have a situation where I'm importing substantial amounts of data
(30,000 to
120,000 rows at a pop) from three external sources through a text file or
an
Excel spreadsheet into three different data tables. I've established
lookup
tables for those fields that can reasonably be normalized between and
amongst
the data tables.

The process I'm going through is:

1. Import the raw data into an "import" table that matches the structure
of
the source data. Also included in the import table are columns for
foreign
keys of 'normalizable' fields, which are set to 0 when the source data is
imported.

2. Append any new lookup data that may be present in the source file to
the
lookup tables.

3. Run a series of update queries on the import table to update the
foreign
key fields with the keys of the lookup data. Depending on the source data
file, there are between 3 and 7 of these update queries.

4. Append new records into the data table using only the foreign key
values
where applicable.

I'm discovering that the update queries in step 3 are taking a LONG time
to
run (several minutes each), which is going to annoy my users to no end.

My questions a
- are there other, better processes or data structures to use?
- is there a way of optimizing update queries?

I appreciate any help or suggestions -
Dave

.



 



