A Microsoft Office (Excel, Word) forum. OfficeFrustration

If this is your first visit, be sure to check out the FAQ by clicking the link above. You may have to register before you can post: click the register link above to proceed. To start viewing messages, select the forum that you want to visit from the selection below.

Go Back   Home » OfficeFrustration forum » Microsoft Access » Database Design
Site Map Home Register Authors List Search Today's Posts Mark Forums Read  

Reduce size of data



 
 
Thread Tools Display Modes
  #1  
Old February 8th, 2006, 04:32 AM posted to microsoft.public.access.tablesdbdesign
external usenet poster
 
Posts: n/a
Default Reduce size of data

I have the assignment to imporve a database's performance.

The data is in a separate back end database that has just under 3 million
rows. It has no index. All queries are running as table scans, of course.

I want to add some indexes that should fix this nicely. But there's no
room. It's almost 2 gigabytes now.

I've found that several columns are much larger than necessary. Here are
the stats:

Datatype Length Number Change
text 50 1 15
decimal 18 27 7
decimal 18 2
date/time 1
integer 2

The above Number column is the number of such columns in the table. I am
expecting to impact the size of the table by changing a 50 character text
column to 15 characters, and by reducing the precision of the 29 decimal
columns to 7.

I could not do this by just changing the datatypes on a full table.
Something about not enough memory. (I like Jet less and less. I would
never have this kind of mess with MSDE!)

So, I made a copy of the table, cleared out all the records, compacted this,
and made the changes to the 27 decimal columns. The table was just over
100K bytes in size. I then linked to the database containing the copy of
the data and created an append query to put the data in.

When this finished, I had saved less than 100K bytes, a fraction of 1% of
the data size. Does not the size of a Decimal datatype depend on the
precision? If not, then why is there a precision. If you're going to store
10 bytes all the time, why not allow them all to be used to hold data?
Allowing two digits (nibbles) per byte, a decimal of precision 18 should
require 10 bytes to store (add one nibble for a sign and divide by two,
rounding up). I chose 7, although 6 would have been sufficient for all
existing data, since I expect it's going to add one for a sign anyway, so 6
+ 1 = 7 would be 4 bytes, and 7 + 1 = 8 would be 4 bytes anyway, so there
would be no savings there. Just my theory.

Anyway, after completing the append, the table had reduced in size by less
than 100K bytes (out of just under 2 GB). By my calculation, I was
expecting a reduction of just over 50%. Does Jet not store the decimal
datatype in varying sized pieces, according to the precision? Is my formula
of (Precision + 1) / 2 not what you would expect?

I then repeated all the steps, reducing the text column from 50 to 15. This
I expected would save 105MB (35 bytes X 3 million). Bottom line: the file
is now larger than ever!

Does anyone know how a database can be reduced in size to accomplish this?

Tom Ellison




  #2  
Old February 8th, 2006, 06:31 AM posted to microsoft.public.access.tablesdbdesign
external usenet poster
 
Posts: n/a
Default Reduce size of data

Tried another approach. Created a new database and typed in the table's
entire definition, but with the reduced sizes. Still not smaller. I'm
expecting it to be half the size, but almost no change. Not even 1%.

Tom Ellison


"Tom Ellison" wrote in message
...
I have the assignment to imporve a database's performance.

The data is in a separate back end database that has just under 3 million
rows. It has no index. All queries are running as table scans, of
course.

I want to add some indexes that should fix this nicely. But there's no
room. It's almost 2 gigabytes now.

I've found that several columns are much larger than necessary. Here are
the stats:

Datatype Length Number Change
text 50 1 15
decimal 18 27 7
decimal 18 2
date/time 1
integer 2

The above Number column is the number of such columns in the table. I am
expecting to impact the size of the table by changing a 50 character text
column to 15 characters, and by reducing the precision of the 29 decimal
columns to 7.

I could not do this by just changing the datatypes on a full table.
Something about not enough memory. (I like Jet less and less. I would
never have this kind of mess with MSDE!)

So, I made a copy of the table, cleared out all the records, compacted
this, and made the changes to the 27 decimal columns. The table was just
over 100K bytes in size. I then linked to the database containing the
copy of the data and created an append query to put the data in.

When this finished, I had saved less than 100K bytes, a fraction of 1% of
the data size. Does not the size of a Decimal datatype depend on the
precision? If not, then why is there a precision. If you're going to
store 10 bytes all the time, why not allow them all to be used to hold
data? Allowing two digits (nibbles) per byte, a decimal of precision 18
should require 10 bytes to store (add one nibble for a sign and divide by
two, rounding up). I chose 7, although 6 would have been sufficient for
all existing data, since I expect it's going to add one for a sign anyway,
so 6 + 1 = 7 would be 4 bytes, and 7 + 1 = 8 would be 4 bytes anyway, so
there would be no savings there. Just my theory.

Anyway, after completing the append, the table had reduced in size by less
than 100K bytes (out of just under 2 GB). By my calculation, I was
expecting a reduction of just over 50%. Does Jet not store the decimal
datatype in varying sized pieces, according to the precision? Is my
formula of (Precision + 1) / 2 not what you would expect?

I then repeated all the steps, reducing the text column from 50 to 15.
This I expected would save 105MB (35 bytes X 3 million). Bottom line:
the file is now larger than ever!

Does anyone know how a database can be reduced in size to accomplish this?

Tom Ellison






  #3  
Old February 8th, 2006, 06:50 AM posted to microsoft.public.access.tablesdbdesign
external usenet poster
 
Posts: n/a
Default Reduce size of data

I changed them to "single" and it cut the file size by 75%. Go figure!

Tom Ellison


"Tom Ellison" wrote in message
...
I have the assignment to imporve a database's performance.

The data is in a separate back end database that has just under 3 million
rows. It has no index. All queries are running as table scans, of
course.

I want to add some indexes that should fix this nicely. But there's no
room. It's almost 2 gigabytes now.

I've found that several columns are much larger than necessary. Here are
the stats:

Datatype Length Number Change
text 50 1 15
decimal 18 27 7
decimal 18 2
date/time 1
integer 2

The above Number column is the number of such columns in the table. I am
expecting to impact the size of the table by changing a 50 character text
column to 15 characters, and by reducing the precision of the 29 decimal
columns to 7.

I could not do this by just changing the datatypes on a full table.
Something about not enough memory. (I like Jet less and less. I would
never have this kind of mess with MSDE!)

So, I made a copy of the table, cleared out all the records, compacted
this, and made the changes to the 27 decimal columns. The table was just
over 100K bytes in size. I then linked to the database containing the
copy of the data and created an append query to put the data in.

When this finished, I had saved less than 100K bytes, a fraction of 1% of
the data size. Does not the size of a Decimal datatype depend on the
precision? If not, then why is there a precision. If you're going to
store 10 bytes all the time, why not allow them all to be used to hold
data? Allowing two digits (nibbles) per byte, a decimal of precision 18
should require 10 bytes to store (add one nibble for a sign and divide by
two, rounding up). I chose 7, although 6 would have been sufficient for
all existing data, since I expect it's going to add one for a sign anyway,
so 6 + 1 = 7 would be 4 bytes, and 7 + 1 = 8 would be 4 bytes anyway, so
there would be no savings there. Just my theory.

Anyway, after completing the append, the table had reduced in size by less
than 100K bytes (out of just under 2 GB). By my calculation, I was
expecting a reduction of just over 50%. Does Jet not store the decimal
datatype in varying sized pieces, according to the precision? Is my
formula of (Precision + 1) / 2 not what you would expect?

I then repeated all the steps, reducing the text column from 50 to 15.
This I expected would save 105MB (35 bytes X 3 million). Bottom line:
the file is now larger than ever!

Does anyone know how a database can be reduced in size to accomplish this?

Tom Ellison






  #4  
Old February 8th, 2006, 07:12 AM posted to microsoft.public.access.tablesdbdesign
external usenet poster
 
Posts: n/a
Default Reduce size of data

AFAIK Jet stores Decimals like dBase does: no nibbles, but one byte per
digit, so at least 18 bytes for a decimal of precision 18. However, if
you only need 7 digits precision you could replace your Decimals with
Longs and hard-code the position of the decimal point.

OTOH Jet does not store empty space if a Text field is not full: a
10-character string will occupy the same amount of space whether the
field size is 10 or 255. (This of course means storing a size and a
pointer for each value.)

Bottom line: you're very near the limits of what Jet's designed for, and
some flavour of SQL client/server looks like the sensible way to go.


On Tue, 7 Feb 2006 22:32:54 -0600, "Tom Ellison"
wrote:

Allowing two digits (nibbles) per byte, a decimal of precision 18 should
require 10 bytes to store (add one nibble for a sign and divide by two,
rounding up). I chose 7, although 6 would have been sufficient for all
existing data, since I expect it's going to add one for a sign anyway, so 6
+ 1 = 7 would be 4 bytes, and 7 + 1 = 8 would be 4 bytes anyway, so there
would be no savings there. Just my theory.


--
John Nurick [Microsoft Access MVP]

Please respond in the newgroup and not by email.

  #5  
Old February 8th, 2006, 07:31 AM posted to microsoft.public.access.tablesdbdesign
external usenet poster
 
Posts: n/a
Default Reduce size of data

Another good reason to never use the Decimal data type in JET.

--
Allen Browne - Microsoft MVP. Perth, Western Australia.
Tips for Access users - http://allenbrowne.com/tips.html
Reply to group, rather than allenbrowne at mvps dot org.

"Tom Ellison" wrote in message
...
I changed them to "single" and it cut the file size by 75%. Go figure!

Tom Ellison



  #6  
Old February 8th, 2006, 07:40 AM posted to microsoft.public.access.tablesdbdesign
external usenet poster
 
Posts: n/a
Default Reduce size of data

Dear John:

MSDE has same limits, but probably better storage algorithms. I'm not sure
there would be enough benefit to switching to MSDE. Client will not upgrade
to full SQL Server. Expense of server OS, SQL Server, plus server hardware
would be nearly $4K I expect. No where close to what he would spend.

I've used the same procedure in changing to a Single datatype, and the
database dropped by 80% in size. Go figure! It didn't drop even 1%
changing from precision 18 to 7. No explanation from me!

So, as you can see, I had already discussed SQL Server as an alternative.
It's one I certainly like.

My initial results are in. Indexing just one numeric column causes a 83%
increase in performance. Pretty good start. Maybe enough to finish the
project. I just hope Single datatype isn't a problem. It's not dollars and
cents, but some kind of ratios and such. Probably this will work, though.
I'll have to explain why 9.17 can result in records that show 9.17,
although it's stored as 9.169993. Life is a series of trade-offs, eh?

I understand about the text pointer thing. This app doesn't search for that
text column. But surely the area where the text is stored makes the overall
database sensitive to the length of the strings. But I guess that's the
actual length, not the maximum allocated length. So now that makes sense
also. If the text were all padded to full length with spaces, that would be
different then.

Thanks for your participation.

Tom Ellison


"John Nurick" wrote in message
...
AFAIK Jet stores Decimals like dBase does: no nibbles, but one byte per
digit, so at least 18 bytes for a decimal of precision 18. However, if
you only need 7 digits precision you could replace your Decimals with
Longs and hard-code the position of the decimal point.

OTOH Jet does not store empty space if a Text field is not full: a
10-character string will occupy the same amount of space whether the
field size is 10 or 255. (This of course means storing a size and a
pointer for each value.)

Bottom line: you're very near the limits of what Jet's designed for, and
some flavour of SQL client/server looks like the sensible way to go.


On Tue, 7 Feb 2006 22:32:54 -0600, "Tom Ellison"
wrote:

Allowing two digits (nibbles) per byte, a decimal of precision 18 should
require 10 bytes to store (add one nibble for a sign and divide by two,
rounding up). I chose 7, although 6 would have been sufficient for all
existing data, since I expect it's going to add one for a sign anyway, so
6
+ 1 = 7 would be 4 bytes, and 7 + 1 = 8 would be 4 bytes anyway, so there
would be no savings there. Just my theory.


--
John Nurick [Microsoft Access MVP]

Please respond in the newgroup and not by email.



  #7  
Old February 8th, 2006, 07:47 AM posted to microsoft.public.access.tablesdbdesign
external usenet poster
 
Posts: n/a
Default Reduce size of data

Dear Allen:

I had never run into it before. It's "new" isn't it. I mean, I knew it was
there since the 2000 version, but I've never had to work with it. Was it in
'97? That's the second version of Access I learned, and the first I learned
thoroughly (oh for the good ole days).

The equivalent in MSDE is very useful. I like it. Data size increases one
byte at a time, and it even tells you the size of storage in bytes when
you're working on it. That's where I got my idea of how I thought this
should work. Apparently, it doesn't work that way at all.

Thanks again, Allen. I'm absorbing as fast as I can tonight.

Tom Ellison


"Allen Browne" wrote in message
...
Another good reason to never use the Decimal data type in JET.

--
Allen Browne - Microsoft MVP. Perth, Western Australia.
Tips for Access users - http://allenbrowne.com/tips.html
Reply to group, rather than allenbrowne at mvps dot org.

"Tom Ellison" wrote in message
...
I changed them to "single" and it cut the file size by 75%. Go figure!

Tom Ellison





  #8  
Old February 8th, 2006, 08:29 AM posted to microsoft.public.access.tablesdbdesign
external usenet poster
 
Posts: n/a
Default Reduce size of data

Correct: JET 3.x (Access 97) did not have the Decimal data type. VBA did
support a Variant of subtype Decimal, but had serious trouble displaying it
on a form back then.

I find it extremely embarrassing that JET 4 gets this query wrong:

SELECT MyField FROM MyTable ORDER BY MyField DESC;

If MyField is an unindexed Decimal field, the negative values appear first,
then the positive ones, and the zeros and nulls sort unpredictibly!

Can you believe that this remains unfixed after more than 5 years? Info,
sample, and k.b. reference:
http://allenbrowne.com/bug-08.html

--
Allen Browne - Microsoft MVP. Perth, Western Australia.
Tips for Access users - http://allenbrowne.com/tips.html
Reply to group, rather than allenbrowne at mvps dot org.

"Tom Ellison" wrote in message
...
Dear Allen:

I had never run into it before. It's "new" isn't it. I mean, I knew it
was there since the 2000 version, but I've never had to work with it. Was
it in '97? That's the second version of Access I learned, and the first I
learned thoroughly (oh for the good ole days).

The equivalent in MSDE is very useful. I like it. Data size increases
one byte at a time, and it even tells you the size of storage in bytes
when you're working on it. That's where I got my idea of how I thought
this should work. Apparently, it doesn't work that way at all.

Thanks again, Allen. I'm absorbing as fast as I can tonight.

Tom Ellison


"Allen Browne" wrote in message
...
Another good reason to never use the Decimal data type in JET.



  #9  
Old February 8th, 2006, 08:42 AM posted to microsoft.public.access.tablesdbdesign
external usenet poster
 
Posts: n/a
Default Reduce size of data

Dear Allen:

I have been using Currency/Money for almost all numeric columns for some
time (other than good old integer and long, of course). I believe I'll
stick with that.

What I cannot conceive is that any such error would be tolerated in SQL
Server. No such error would ever be released, let alone escape correction
for years. That speaks for itself.

Tom Ellison


"Allen Browne" wrote in message
...
Correct: JET 3.x (Access 97) did not have the Decimal data type. VBA did
support a Variant of subtype Decimal, but had serious trouble displaying
it on a form back then.

I find it extremely embarrassing that JET 4 gets this query wrong:

SELECT MyField FROM MyTable ORDER BY MyField DESC;

If MyField is an unindexed Decimal field, the negative values appear
first, then the positive ones, and the zeros and nulls sort unpredictibly!

Can you believe that this remains unfixed after more than 5 years? Info,
sample, and k.b. reference:
http://allenbrowne.com/bug-08.html

--
Allen Browne - Microsoft MVP. Perth, Western Australia.
Tips for Access users - http://allenbrowne.com/tips.html
Reply to group, rather than allenbrowne at mvps dot org.

"Tom Ellison" wrote in message
...
Dear Allen:

I had never run into it before. It's "new" isn't it. I mean, I knew it
was there since the 2000 version, but I've never had to work with it.
Was it in '97? That's the second version of Access I learned, and the
first I learned thoroughly (oh for the good ole days).

The equivalent in MSDE is very useful. I like it. Data size increases
one byte at a time, and it even tells you the size of storage in bytes
when you're working on it. That's where I got my idea of how I thought
this should work. Apparently, it doesn't work that way at all.

Thanks again, Allen. I'm absorbing as fast as I can tonight.

Tom Ellison


"Allen Browne" wrote in message
...
Another good reason to never use the Decimal data type in JET.





  #10  
Old February 8th, 2006, 12:35 PM posted to microsoft.public.access.tablesdbdesign
external usenet poster
 
Posts: n/a
Default Reduce size of data

Tom

Have you looked into the new Express SQL Server 2005 -- download for free.

Regards

Jeff Boyce
Microsoft Office/Access MVP
Microsoft IT Academy Program Mentor

"Tom Ellison" wrote in message
...
Dear John:

MSDE has same limits, but probably better storage algorithms. I'm not

sure
there would be enough benefit to switching to MSDE. Client will not

upgrade
to full SQL Server. Expense of server OS, SQL Server, plus server

hardware
would be nearly $4K I expect. No where close to what he would spend.

I've used the same procedure in changing to a Single datatype, and the
database dropped by 80% in size. Go figure! It didn't drop even 1%
changing from precision 18 to 7. No explanation from me!

So, as you can see, I had already discussed SQL Server as an alternative.
It's one I certainly like.

My initial results are in. Indexing just one numeric column causes a 83%
increase in performance. Pretty good start. Maybe enough to finish the
project. I just hope Single datatype isn't a problem. It's not dollars

and
cents, but some kind of ratios and such. Probably this will work, though.
I'll have to explain why 9.17 can result in records that show 9.17,
although it's stored as 9.169993. Life is a series of trade-offs, eh?

I understand about the text pointer thing. This app doesn't search for

that
text column. But surely the area where the text is stored makes the

overall
database sensitive to the length of the strings. But I guess that's the
actual length, not the maximum allocated length. So now that makes sense
also. If the text were all padded to full length with spaces, that would

be
different then.

Thanks for your participation.

Tom Ellison


"John Nurick" wrote in message
...
AFAIK Jet stores Decimals like dBase does: no nibbles, but one byte per
digit, so at least 18 bytes for a decimal of precision 18. However, if
you only need 7 digits precision you could replace your Decimals with
Longs and hard-code the position of the decimal point.

OTOH Jet does not store empty space if a Text field is not full: a
10-character string will occupy the same amount of space whether the
field size is 10 or 255. (This of course means storing a size and a
pointer for each value.)

Bottom line: you're very near the limits of what Jet's designed for, and
some flavour of SQL client/server looks like the sensible way to go.


On Tue, 7 Feb 2006 22:32:54 -0600, "Tom Ellison"
wrote:

Allowing two digits (nibbles) per byte, a decimal of precision 18 should
require 10 bytes to store (add one nibble for a sign and divide by two,
rounding up). I chose 7, although 6 would have been sufficient for all
existing data, since I expect it's going to add one for a sign anyway,

so
6
+ 1 = 7 would be 4 bytes, and 7 + 1 = 8 would be 4 bytes anyway, so

there
would be no savings there. Just my theory.


--
John Nurick [Microsoft Access MVP]

Please respond in the newgroup and not by email.




 




Thread Tools
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

vB code is On
Smilies are On
[IMG] code is Off
HTML code is Off
Forum Jump

Similar Threads
Thread Thread Starter Forum Replies Last Post
Query does not order the same data in the same way Milković Aleksandar Running & Setting Up Queries 10 December 15th, 2005 12:01 PM
Microsoft Office should use an advance data crosslinking model m_jurrens General Discussions 0 December 15th, 2005 07:33 AM
How to Reduce Spreadsheet Size and Speed ExcelMonkey Worksheet Functions 4 August 9th, 2005 06:38 PM
multiple docs, one data source kp Mailmerge 12 January 31st, 2005 04:41 PM
VBA Code problem error 9 Speedy General Discussion 19 October 15th, 2004 09:05 PM


All times are GMT +1. The time now is 12:39 AM.


Powered by vBulletin® Version 3.6.4
Copyright ©2000 - 2024, Jelsoft Enterprises Ltd.
Copyright ©2004-2024 OfficeFrustration.
The comments are property of their posters.