#1
"Normalizing" imported data
Hello all -
I have a situation where I'm importing substantial amounts of data (30,000 to 120,000 rows at a pop) from three external sources, through a text file or an Excel spreadsheet, into three different data tables. I've established lookup tables for those fields that can reasonably be normalized between and amongst the data tables. The process I'm going through is:

1. Import the raw data into an "import" table that matches the structure of the source data. The import table also includes columns for the foreign keys of 'normalizable' fields, which are set to 0 when the source data is imported.
2. Append any new lookup data that may be present in the source file to the lookup tables.
3. Run a series of update queries on the import table to fill the foreign key fields with the keys of the lookup data. Depending on the source data file, there are between 3 and 7 of these update queries.
4. Append new records into the data table, using only the foreign key values where applicable.

I'm discovering that the update queries in step 3 are taking a LONG time to run (several minutes each), which is going to annoy my users to no end.

My questions are:
- are there other, better processes or data structures to use?
- is there a way of optimizing update queries?

I appreciate any help or suggestions -

Dave
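For concreteness, steps 2 and 3 above might look something like the following in Access SQL. All table and field names here (tblImport, tblVendorLookup, VendorName, VendorID, VendorFK) are made up for illustration; you'd have one such pair of queries per normalizable field:

-- Step 2: append lookup values from the import that aren't
-- already in the lookup table
INSERT INTO tblVendorLookup (VendorName)
SELECT DISTINCT tblImport.VendorName
FROM tblImport LEFT JOIN tblVendorLookup
ON tblImport.VendorName = tblVendorLookup.VendorName
WHERE tblVendorLookup.VendorID IS NULL;

-- Step 3: one of the update queries -- write the lookup key
-- back into the import table's foreign-key column
UPDATE tblImport INNER JOIN tblVendorLookup
ON tblImport.VendorName = tblVendorLookup.VendorName
SET tblImport.VendorFK = tblVendorLookup.VendorID;

Each step-3 query of this shape rewrites every row of the import table, which is why running 3 to 7 of them adds up.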
#2
"Normalizing" imported data
JOPO (just one person's opinion)...
Adding rows of data can slow down if your tables are indexed, as Access will need to check existing rows to enforce constraints and do the indexing. On the other hand, removing the indexes could mean loading bogus data. Would your users be willing to trade a longer load time for guarantees that they have good data?

--
Regards

Jeff Boyce
Microsoft Access MVP

Disclaimer: This author may have received products and services mentioned in this post. Mention and/or description of a product or service herein does not constitute endorsement thereof. Any code or pseudocode included in this post is offered "as is", with no guarantee as to suitability. You can thank the FTC of the USA for making this disclaimer possible/necessary.

"pccdavef" u56849@uwe wrote in message news:a093c0a1fd378@uwe...
[quoted text clipped]
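If you do test the indexing trade-off Jeff describes, the mechanics in Access DDL are roughly as follows (index and table names are hypothetical; weigh this against his point that dropping indexes also drops the integrity checks they enforce):

-- Before the bulk load: drop the index on the import table's
-- foreign-key column (idxVendorFK and tblImport are made-up names)
DROP INDEX idxVendorFK ON tblImport;

-- ... run the import and the update queries ...

-- After the load: rebuild the index so later joins stay fast
CREATE INDEX idxVendorFK ON tblImport (VendorFK);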
#3
"Normalizing" imported data
hi Dave,
On 14.12.2009 14:32, pccdavef wrote:
> 3. Run a series of update queries on the import table to update the foreign key fields with the keys of the lookup data. Depending on the source data file, there are between 3 and 7 of these update queries.
> I'm discovering that the update queries in step 3 are taking a LONG time to run (several minutes each), which is going to annoy my users to no end.
> My questions are:
> - are there other, better processes or data structures to use?
> - is there a way of optimizing update queries?

I have to assume that you haven't indexed these fields in the import table? So you're running up to seven queries against the whole table.

I simply would not update my import tables. Only insert the new records:

INSERT INTO [table] (fieldlist)
SELECT fieldsFromImport, fieldsFromLookup
FROM [importTable]
INNER JOIN [lookupTable] ON condition

Use one INSERT with all necessary JOINs. At the least, this should reduce the number of table reads from seven to one.

mfG
--
stefan
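Filling in Stefan's skeleton with the same hypothetical names as before (tblImport, tblData, and two lookup tables; all names illustrative), a two-lookup version might read:

-- One pass: read the import table once, pick up every lookup key
-- via JOINs, and append straight into the destination table.
-- Access wants parentheses around each additional JOIN.
INSERT INTO tblData (OrderDate, Amount, VendorFK, RegionFK)
SELECT i.OrderDate, i.Amount, v.VendorID, r.RegionID
FROM (tblImport AS i
INNER JOIN tblVendorLookup AS v ON i.VendorName = v.VendorName)
INNER JOIN tblRegionLookup AS r ON i.RegionName = r.RegionName;

Note the INNER JOINs silently drop any import row whose lookup value is missing, so the step that appends new lookup values first still matters.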
#4
"Normalizing" imported data
Thanks Stefan -
I like that idea; I'll have to play with it some, but I think you're right - it's much more efficient. Just a couple of hellacious queries. Thanks!

Dave

Stefan Hoffmann wrote:
[quoted text clipped]
#5
"Normalizing" imported data
I'm not sure how accurate this would be, or how applicable it would be to your version of Access, but how about converting your query to VBA code?

"pccdavef" wrote:
[quoted text clipped]
#6
"Normalizing" imported data
In your experience, does your VBA code run faster (I'm assuming a record-by-record iteration through a recordset) than the set-level operations that Access queries are designed to provide? (I'm always looking for counter-examples and/or confirmation...)

--
Regards

Jeff Boyce
Microsoft Access MVP

"rolaaus" wrote in message ...
[quoted text clipped]