If this is your first visit, be sure to check out the FAQ by clicking the link above. You may have to register before you can post: click the register link above to proceed. To start viewing messages, select the forum that you want to visit from the selection below. |
|
|
Thread Tools | Display Modes |
#1
|
|||
|
|||
Unique Index / Need advice
Greetings all...
For my current project I've been assigned a database conversion job from the old Lotus platform to Access. Needless to say, the existing database is a number of years old. There are 3 main tables to work with, one of which contains something around 180k records. Unfortunately, none of these tables contain any unique identifier fields so they can be very hard to work with. My first step was to filter out duplicates and to add a GUID field, but I know there are still duplicate accounts saved in these tables that have slightly different data. So, I'm thinking I need to create a second unique index field that's based off of key values so that I can group records by account and sort it all out. Does anybody have any idea about how to do this? Thanks in advance. |
#2
|
|||
|
|||
Unique Index / Need advice
There is a wizard in Access to find duplicate entries. If you go to queries,
at the top of the database window there is a button called new. If you click that button it brings up several wizard options, one being Find Duplicates Query Wizard. Follow the steps in the wizard and you should be able to clean up your database. Otherwise you could add a field to your table, make it an autonumber, and then each record in your table would have a unique value and you could build a query to sort and filter out duplicates. -- Please remember to mark this post as answered if this solves your problem. "JString" wrote: Greetings all... For my current project I've been assigned a database conversion job from the old Lotus platform to Access. Needless to say, the existing database is a number of years old. There are 3 main tables to work with, one of which contains something around 180k records. Unfortunately, none of these tables contain any unique identifier fields so they can be very hard to work with. My first step was to filter out duplicates and to add a GUID field, but I know there are still duplicate accounts saved in these tables that have slightly different data. So, I'm thinking I need to create a second unique index field that's based off of key values so that I can group records by account and sort it all out. Does anybody have any idea about how to do this? Thanks in advance. |
#3
|
|||
|
|||
Unique Index / Need advice
Why not add an autonumber primary key as a starting point to do all of this
work, including finding duplication accounts. To start with, a record is just a record (rather than an account) until you get it cleaned up. Maybe then / later the account number could become your PK. "JString" wrote: Greetings all... For my current project I've been assigned a database conversion job from the old Lotus platform to Access. Needless to say, the existing database is a number of years old. There are 3 main tables to work with, one of which contains something around 180k records. Unfortunately, none of these tables contain any unique identifier fields so they can be very hard to work with. My first step was to filter out duplicates and to add a GUID field, but I know there are still duplicate accounts saved in these tables that have slightly different data. So, I'm thinking I need to create a second unique index field that's based off of key values so that I can group records by account and sort it all out. Does anybody have any idea about how to do this? Thanks in advance. |
#4
|
|||
|
|||
Unique Index / Need advice
I tried the query wizard but it only allows for 10 fields to be compared
whereas my tables have 24 or so fields. Also it didn't work for some reason. There's already a GUID field that I added to guard against further duplicates. What I want to do is create an index with meaningful values and with duplicates set to 'ok' so that I can gather all the records for a particular case/account. For example, say I want to identify an account by Plaintiff, Defendant, case number, and date. I'm thinking I should create a field that combines a set number of characters from each of these fields in a predetermined sequence. But it would have to be done in a way through which it would be statistically unlikely that a duplicate value would be created unless that particular record belongs to the same account. If this is possible, after I sort out all of the garbage, the end result should be one record for each account rather than several unorganized records for one account. "Fred" wrote: Why not add an autonumber primary key as a starting point to do all of this work, including finding duplication accounts. To start with, a record is just a record (rather than an account) until you get it cleaned up. Maybe then / later the account number could become your PK. "JString" wrote: Greetings all... For my current project I've been assigned a database conversion job from the old Lotus platform to Access. Needless to say, the existing database is a number of years old. There are 3 main tables to work with, one of which contains something around 180k records. Unfortunately, none of these tables contain any unique identifier fields so they can be very hard to work with. My first step was to filter out duplicates and to add a GUID field, but I know there are still duplicate accounts saved in these tables that have slightly different data. So, I'm thinking I need to create a second unique index field that's based off of key values so that I can group records by account and sort it all out. Does anybody have any idea about how to do this? Thanks in advance. |
#5
|
|||
|
|||
Unique Index / Need advice
What you are wanting to do is concatenate fields. In your query you would
want to add a new field with this code. JoinedFields: [Plaintiff]&""&[Defendant]&""&[CaseNumber]&""&[Date] This will give you one value of all those fields joined together. You could then do the duplicates wizard, or you could add autonumber field to your table and in a query only show the autonumber field and this newly joined field. Once you run this query you would know what number records to correct. -- Please remember to mark this post as answered if this solves your problem. "JString" wrote: I tried the query wizard but it only allows for 10 fields to be compared whereas my tables have 24 or so fields. Also it didn't work for some reason. There's already a GUID field that I added to guard against further duplicates. What I want to do is create an index with meaningful values and with duplicates set to 'ok' so that I can gather all the records for a particular case/account. For example, say I want to identify an account by Plaintiff, Defendant, case number, and date. I'm thinking I should create a field that combines a set number of characters from each of these fields in a predetermined sequence. But it would have to be done in a way through which it would be statistically unlikely that a duplicate value would be created unless that particular record belongs to the same account. If this is possible, after I sort out all of the garbage, the end result should be one record for each account rather than several unorganized records for one account. "Fred" wrote: Why not add an autonumber primary key as a starting point to do all of this work, including finding duplication accounts. To start with, a record is just a record (rather than an account) until you get it cleaned up. Maybe then / later the account number could become your PK. "JString" wrote: Greetings all... For my current project I've been assigned a database conversion job from the old Lotus platform to Access. Needless to say, the existing database is a number of years old. There are 3 main tables to work with, one of which contains something around 180k records. Unfortunately, none of these tables contain any unique identifier fields so they can be very hard to work with. My first step was to filter out duplicates and to add a GUID field, but I know there are still duplicate accounts saved in these tables that have slightly different data. So, I'm thinking I need to create a second unique index field that's based off of key values so that I can group records by account and sort it all out. Does anybody have any idea about how to do this? Thanks in advance. |
#6
|
|||
|
|||
Unique Index / Need advice
It looks like that might work. I'll give it a shot.
"Ryan" wrote: What you are wanting to do is concatenate fields. In your query you would want to add a new field with this code. JoinedFields: [Plaintiff]&""&[Defendant]&""&[CaseNumber]&""&[Date] This will give you one value of all those fields joined together. You could then do the duplicates wizard, or you could add autonumber field to your table and in a query only show the autonumber field and this newly joined field. Once you run this query you would know what number records to correct. -- Please remember to mark this post as answered if this solves your problem. "JString" wrote: I tried the query wizard but it only allows for 10 fields to be compared whereas my tables have 24 or so fields. Also it didn't work for some reason. There's already a GUID field that I added to guard against further duplicates. What I want to do is create an index with meaningful values and with duplicates set to 'ok' so that I can gather all the records for a particular case/account. For example, say I want to identify an account by Plaintiff, Defendant, case number, and date. I'm thinking I should create a field that combines a set number of characters from each of these fields in a predetermined sequence. But it would have to be done in a way through which it would be statistically unlikely that a duplicate value would be created unless that particular record belongs to the same account. If this is possible, after I sort out all of the garbage, the end result should be one record for each account rather than several unorganized records for one account. "Fred" wrote: Why not add an autonumber primary key as a starting point to do all of this work, including finding duplication accounts. To start with, a record is just a record (rather than an account) until you get it cleaned up. Maybe then / later the account number could become your PK. "JString" wrote: Greetings all... For my current project I've been assigned a database conversion job from the old Lotus platform to Access. Needless to say, the existing database is a number of years old. There are 3 main tables to work with, one of which contains something around 180k records. Unfortunately, none of these tables contain any unique identifier fields so they can be very hard to work with. My first step was to filter out duplicates and to add a GUID field, but I know there are still duplicate accounts saved in these tables that have slightly different data. So, I'm thinking I need to create a second unique index field that's based off of key values so that I can group records by account and sort it all out. Does anybody have any idea about how to do this? Thanks in advance. |
#7
|
|||
|
|||
Unique Index / Need advice
Thanks for the tips everyone. I was finally able to put together a usable
list of duplicates. The next thing I want to do is to create some sort of a index field that can be used to guard against multiple records per account being entered into the system in the future. The concatenated values would work, but the resulting string would be a pretty large value to store. Are there any well known methods out there for creating ID fields based on data in several fields? |
#8
|
|||
|
|||
Unique Index / Need advice
On Sat, 12 Jul 2008 23:46:00 -0700, JString
wrote: Thanks for the tips everyone. I was finally able to put together a usable list of duplicates. The next thing I want to do is to create some sort of a index field that can be used to guard against multiple records per account being entered into the system in the future. The concatenated values would work, but the resulting string would be a pretty large value to store. Are there any well known methods out there for creating ID fields based on data in several fields? Well... DON'T. You can create a unique index on up to *ten* fields. Storing the data redundantly in some sort of concatenated field is fraught with problems and should only be done if there is *NO* better way, and there usually is a better way! To create a multifield index, use the Indexes tool in table design view (looks like lightning hitting a datasheet, which I've wished would happen at times...). Put a name for the index in the left column, and select the fieldnames on that row and successive rows in the right column. Specify that the index is unique using the checkbox on the tool. -- John W. Vinson [MVP] |
Thread Tools | |
Display Modes | |
|
|