A Microsoft Office (Excel, Word) forum. OfficeFrustration

If this is your first visit, be sure to check out the FAQ by clicking the link above. You may have to register before you can post: click the register link above to proceed. To start viewing messages, select the forum that you want to visit from the selection below.

Go Back   Home » OfficeFrustration forum » Microsoft Access » Running & Setting Up Queries
Site Map Home Register Authors List Search Today's Posts Mark Forums Read  

Duplicate values



 
 
Thread Tools Display Modes
  #1  
Old July 8th, 2008, 01:22 PM posted to microsoft.public.access.queries
Nuno Gomes
external usenet poster
 
Posts: 23
Default Duplicate values

Hello, all


I have a table that have +/- 40000 records of duplicate entries.

I know i need the record more recent of the duplicates (i have the field
DATA_V to order the records).
My question is how i delete all duplicates less the more recent record?
Or any other way to solve this problem?...


There is the exemple:
My table: PVENDAS
MATRICULA, MARCA, ...., DATA_V, ...
01-05-DF VOLVO 20050105
01-05-DF VOLVO 20050915
01-05-DF VOLVO 20060728
01-05-DF VOLVO 20080312
55-US-01 OPEL 20060502
55-US-01 OPEL 20071108
55-US-01 OPEL 20080705
....

I like to have, in final, my table: PVENDAS
MATRICULA, MARCA, ...., DATA_V, ...
01-05-DF VOLVO 20080312

55-US-01 OPEL 20080705
....




Thank you for all your help.

Nuno Gomes


  #2  
Old July 8th, 2008, 02:54 PM posted to microsoft.public.access.queries
Allen Browne
external usenet poster
 
Posts: 11,706
Default Duplicate values

Backup first!

DELETE FROM PVENDAS
WHERE EXISTS
(SELECT MATRICULA
FROM PVENDAS AS T
WHERE T.MATRICULA = PVENDAS.MATRICULA
AND T.MARCA = PVENDAS.MARCA
AND T.DATA_V PVENDAS.DATA_V);

If subqueries are new, here's an introduction:
http://allenbrowne.com/subquery-01.html

--
Allen Browne - Microsoft MVP. Perth, Western Australia
Tips for Access users - http://allenbrowne.com/tips.html
Reply to group, rather than allenbrowne at mvps dot org.

"Nuno Gomes" wrote in message
...

I have a table that have +/- 40000 records of duplicate entries.

I know i need the record more recent of the duplicates (i have the field
DATA_V to order the records).
My question is how i delete all duplicates less the more recent record?
Or any other way to solve this problem?...


There is the exemple:
My table: PVENDAS
MATRICULA, MARCA, ...., DATA_V, ...
01-05-DF VOLVO 20050105
01-05-DF VOLVO 20050915
01-05-DF VOLVO 20060728
01-05-DF VOLVO 20080312
55-US-01 OPEL 20060502
55-US-01 OPEL 20071108
55-US-01 OPEL 20080705
...

I like to have, in final, my table: PVENDAS
MATRICULA, MARCA, ...., DATA_V, ...
01-05-DF VOLVO 20080312
55-US-01 OPEL 20080705

  #3  
Old July 8th, 2008, 03:05 PM posted to microsoft.public.access.queries
Michel Walsh[_2_]
external usenet poster
 
Posts: 56
Default Duplicate values

40K records? Maybe standard joins will be slow, so I will suggest you make a
temporary table, with the three fields, BUT, add an index built on the two
first fields NOT allowing duplicated value. (To define such an index, in the
index form of the table design, supply a name for the index in the first
line and LEAVE that column empty in the second line, second line getting the
second field making that 'compound' index... be sure to check the option
that the index should not allow dup :-) ).


Next, append the data from your initial table to this temp table with an
ORDER BY clause. Jet respects that order by clause (but that is not
documented, so it may be considered as susceptible to change in the future).
The ORDER BY clause must be by DECREASING value of the third field:


INSERT INTO temp
SELECT f1, f2, f3
FROM originalTable
ORDER BY f3 DESC



Execution of that query from the user interface will prompt you that some
records have not been appended due to uniqueness constraint violation...
that is what we want.


The result is then in the temp table.


Remember to define the no dup index on couple { f1, f2}, else, all records
will be appended.

Remember to use the ORDER BY f3 DESC, so the maximum value of f3, for each
{ f1, f2} will be the first one to 'make it' to the table and the smaller
value of f3 will be rejected, for the same { f1, f2} .


Vanderghast, Access MVP


"Nuno Gomes" wrote in message
...
Hello, all


I have a table that have +/- 40000 records of duplicate entries.

I know i need the record more recent of the duplicates (i have the field
DATA_V to order the records).
My question is how i delete all duplicates less the more recent record?
Or any other way to solve this problem?...


There is the exemple:
My table: PVENDAS
MATRICULA, MARCA, ...., DATA_V, ...
01-05-DF VOLVO 20050105
01-05-DF VOLVO 20050915
01-05-DF VOLVO 20060728
01-05-DF VOLVO 20080312
55-US-01 OPEL 20060502
55-US-01 OPEL 20071108
55-US-01 OPEL 20080705
...

I like to have, in final, my table: PVENDAS
MATRICULA, MARCA, ...., DATA_V, ...
01-05-DF VOLVO 20080312

55-US-01 OPEL 20080705
...




Thank you for all your help.

Nuno Gomes



 




Thread Tools
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

vB code is On
Smilies are On
[IMG] code is Off
HTML code is Off
Forum Jump


All times are GMT +1. The time now is 11:46 PM.


Powered by vBulletin® Version 3.6.4
Copyright ©2000 - 2024, Jelsoft Enterprises Ltd.
Copyright ©2004-2024 OfficeFrustration.
The comments are property of their posters.