Unique Index / Need advice

**JString** · July 10th, 2008, 04:55 PM posted to microsoft.public.access.tablesdbdesign

Greetings all...

For my current project I've been assigned a database conversion job from the
old Lotus platform to Access. Needless to say, the existing database is a
number of years old. There are 3 main tables to work with, one of which
contains something around 180k records.

Unfortunately, none of these tables contain any unique identifier fields so
they can be very hard to work with. My first step was to filter out
duplicates and to add a GUID field, but I know there are still duplicate
accounts saved in these tables that have slightly different data.

So, I'm thinking I need to create a second unique index field that's based
off of key values so that I can group records by account and sort it all out.

Does anybody have any idea about how to do this?

Thanks in advance.

**Ryan** · July 10th, 2008, 06:26 PM posted to microsoft.public.access.tablesdbdesign

There is a wizard in Access to find duplicate entries. If you go to queries,
at the top of the database window there is a button called new. If you click
that button it brings up several wizard options, one being Find Duplicates
Query Wizard. Follow the steps in the wizard and you should be able to clean
up your database. Otherwise you could add a field to your table, make it an
autonumber, and then each record in your table would have a unique value and
you could build a query to sort and filter out duplicates.
--
Please remember to mark this post as answered if this solves your problem.

"JString" wrote:

Greetings all...

For my current project I've been assigned a database conversion job from the
old Lotus platform to Access. Needless to say, the existing database is a
number of years old. There are 3 main tables to work with, one of which
contains something around 180k records.

Unfortunately, none of these tables contain any unique identifier fields so
they can be very hard to work with. My first step was to filter out
duplicates and to add a GUID field, but I know there are still duplicate
accounts saved in these tables that have slightly different data.

So, I'm thinking I need to create a second unique index field that's based
off of key values so that I can group records by account and sort it all out.

Does anybody have any idea about how to do this?

Thanks in advance.

**Fred** · July 10th, 2008, 10:47 PM posted to microsoft.public.access.tablesdbdesign

Why not add an autonumber primary key as a starting point to do all of this
work, including finding duplication accounts. To start with, a record is
just a record (rather than an account) until you get it cleaned up. Maybe
then / later the account number could become your PK.

"JString" wrote:

Greetings all...

For my current project I've been assigned a database conversion job from the
old Lotus platform to Access. Needless to say, the existing database is a
number of years old. There are 3 main tables to work with, one of which
contains something around 180k records.

Unfortunately, none of these tables contain any unique identifier fields so
they can be very hard to work with. My first step was to filter out
duplicates and to add a GUID field, but I know there are still duplicate
accounts saved in these tables that have slightly different data.

So, I'm thinking I need to create a second unique index field that's based
off of key values so that I can group records by account and sort it all out.

Does anybody have any idea about how to do this?

Thanks in advance.

**JString** · July 11th, 2008, 05:45 AM posted to microsoft.public.access.tablesdbdesign

I tried the query wizard but it only allows for 10 fields to be compared
whereas my tables have 24 or so fields. Also it didn't work for some reason.

There's already a GUID field that I added to guard against further
duplicates. What I want to do is create an index with meaningful values and
with duplicates set to 'ok' so that I can gather all the records for a
particular case/account. For example, say I want to identify an account by
Plaintiff, Defendant, case number, and date. I'm thinking I should create a
field that combines a set number of characters from each of these fields in a
predetermined sequence. But it would have to be done in a way through which
it would be statistically unlikely that a duplicate value would be created
unless that particular record belongs to the same account. If this is
possible, after I sort out all of the garbage, the end result should be one
record for each account rather than several unorganized records for one
account.

"Fred" wrote:

Why not add an autonumber primary key as a starting point to do all of this
work, including finding duplication accounts. To start with, a record is
just a record (rather than an account) until you get it cleaned up. Maybe
then / later the account number could become your PK.

"JString" wrote:

Greetings all...

For my current project I've been assigned a database conversion job from the
old Lotus platform to Access. Needless to say, the existing database is a
number of years old. There are 3 main tables to work with, one of which
contains something around 180k records.

Unfortunately, none of these tables contain any unique identifier fields so
they can be very hard to work with. My first step was to filter out
duplicates and to add a GUID field, but I know there are still duplicate
accounts saved in these tables that have slightly different data.

So, I'm thinking I need to create a second unique index field that's based
off of key values so that I can group records by account and sort it all out.

Does anybody have any idea about how to do this?

Thanks in advance.

**Ryan** · July 11th, 2008, 02:53 PM posted to microsoft.public.access.tablesdbdesign

What you are wanting to do is concatenate fields. In your query you would
want to add a new field with this code.

JoinedFields: [Plaintiff]&""&[Defendant]&""&[CaseNumber]&""&[Date]

This will give you one value of all those fields joined together. You could
then do the duplicates wizard, or you could add autonumber field to your
table and in a query only show the autonumber field and this newly joined
field. Once you run this query you would know what number records to correct.
--
Please remember to mark this post as answered if this solves your problem.

"JString" wrote:

I tried the query wizard but it only allows for 10 fields to be compared
whereas my tables have 24 or so fields. Also it didn't work for some reason.

There's already a GUID field that I added to guard against further
duplicates. What I want to do is create an index with meaningful values and
with duplicates set to 'ok' so that I can gather all the records for a
particular case/account. For example, say I want to identify an account by
Plaintiff, Defendant, case number, and date. I'm thinking I should create a
field that combines a set number of characters from each of these fields in a
predetermined sequence. But it would have to be done in a way through which
it would be statistically unlikely that a duplicate value would be created
unless that particular record belongs to the same account. If this is
possible, after I sort out all of the garbage, the end result should be one
record for each account rather than several unorganized records for one
account.

"Fred" wrote:

Why not add an autonumber primary key as a starting point to do all of this
work, including finding duplication accounts. To start with, a record is
just a record (rather than an account) until you get it cleaned up. Maybe
then / later the account number could become your PK.

"JString" wrote:

Greetings all...

For my current project I've been assigned a database conversion job from the
old Lotus platform to Access. Needless to say, the existing database is a
number of years old. There are 3 main tables to work with, one of which
contains something around 180k records.

Unfortunately, none of these tables contain any unique identifier fields so
they can be very hard to work with. My first step was to filter out
duplicates and to add a GUID field, but I know there are still duplicate
accounts saved in these tables that have slightly different data.

So, I'm thinking I need to create a second unique index field that's based
off of key values so that I can group records by account and sort it all out.

Does anybody have any idea about how to do this?

Thanks in advance.

**JString** · July 11th, 2008, 04:26 PM posted to microsoft.public.access.tablesdbdesign

It looks like that might work. I'll give it a shot.

"Ryan" wrote:

What you are wanting to do is concatenate fields. In your query you would
want to add a new field with this code.

JoinedFields: [Plaintiff]&""&[Defendant]&""&[CaseNumber]&""&[Date]

This will give you one value of all those fields joined together. You could
then do the duplicates wizard, or you could add autonumber field to your
table and in a query only show the autonumber field and this newly joined
field. Once you run this query you would know what number records to correct.
--
Please remember to mark this post as answered if this solves your problem.

"JString" wrote:

I tried the query wizard but it only allows for 10 fields to be compared
whereas my tables have 24 or so fields. Also it didn't work for some reason.

There's already a GUID field that I added to guard against further
duplicates. What I want to do is create an index with meaningful values and
with duplicates set to 'ok' so that I can gather all the records for a
particular case/account. For example, say I want to identify an account by
Plaintiff, Defendant, case number, and date. I'm thinking I should create a
field that combines a set number of characters from each of these fields in a
predetermined sequence. But it would have to be done in a way through which
it would be statistically unlikely that a duplicate value would be created
unless that particular record belongs to the same account. If this is
possible, after I sort out all of the garbage, the end result should be one
record for each account rather than several unorganized records for one
account.

"Fred" wrote:

Why not add an autonumber primary key as a starting point to do all of this
work, including finding duplication accounts. To start with, a record is
just a record (rather than an account) until you get it cleaned up. Maybe
then / later the account number could become your PK.

"JString" wrote:

Greetings all...

For my current project I've been assigned a database conversion job from the
old Lotus platform to Access. Needless to say, the existing database is a
number of years old. There are 3 main tables to work with, one of which
contains something around 180k records.

Unfortunately, none of these tables contain any unique identifier fields so
they can be very hard to work with. My first step was to filter out
duplicates and to add a GUID field, but I know there are still duplicate
accounts saved in these tables that have slightly different data.

So, I'm thinking I need to create a second unique index field that's based
off of key values so that I can group records by account and sort it all out.

Does anybody have any idea about how to do this?

Thanks in advance.

**JString** · July 13th, 2008, 07:46 AM posted to microsoft.public.access.tablesdbdesign

Thanks for the tips everyone. I was finally able to put together a usable
list of duplicates.

The next thing I want to do is to create some sort of a index field that can
be used to guard against multiple records per account being entered into the
system in the future. The concatenated values would work, but the resulting
string would be a pretty large value to store. Are there any well known
methods out there for creating ID fields based on data in several fields?

**John W. Vinson** · July 14th, 2008, 12:26 AM posted to microsoft.public.access.tablesdbdesign

On Sat, 12 Jul 2008 23:46:00 -0700, JString
wrote:

Thanks for the tips everyone. I was finally able to put together a usable
list of duplicates.

The next thing I want to do is to create some sort of a index field that can
be used to guard against multiple records per account being entered into the
system in the future. The concatenated values would work, but the resulting
string would be a pretty large value to store. Are there any well known
methods out there for creating ID fields based on data in several fields?

Well... DON'T.

You can create a unique index on up to *ten* fields. Storing the data
redundantly in some sort of concatenated field is fraught with problems and
should only be done if there is *NO* better way, and there usually is a better
way!

To create a multifield index, use the Indexes tool in table design view (looks
like lightning hitting a datasheet, which I've wished would happen at
times...). Put a name for the index in the left column, and select the
fieldnames on that row and successive rows in the right column. Specify that
the index is unique using the checkbox on the tool.
--

John W. Vinson [MVP]

Thread Tools
Show Printable Version Email this Page
Display Modes
Linear Mode Switch to Hybrid Mode Switch to Threaded Mode