A Microsoft Office (Excel, Word) forum. OfficeFrustration


Word doc analyzer tool?



 
 
August 16th, 2006, 11:56 PM, posted to microsoft.public.word.formatting.longdocs
Graham Wideman [Visio MVP]

Folks:

Are there any useful tools for analyzing and troubleshooting Word docs?

I'm thinking along the lines of a tool that might read (via Automation) all
the data in the Word document model, and present all the data in some
intelligent fashion -- suitable for troubleshooting oddball problems.

Anything in that neighborhood?

Graham


August 17th, 2006, 01:34 PM, posted to microsoft.public.word.formatting.longdocs
John McGhie [MVP - Word and Word Macintosh]

Hi Graham:

Yes. It's a tool called "Word" :-)

Save the document as "Web Page" (better: as "XML").

Web Page saves as HTML with Word-specific markup, which is somewhat easier for humans to read. XML
is easier for machines to read. Either of them will get you the entire
content of the document.

Alternatively, you can save the document as RTF. RTF is very close to
Word's native format. It's huge and very convoluted, but if you open RTF in
a text editor, you can see exactly what's in there.
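
Since RTF output is huge, a small script can give a first overview before you read it by eye. The following is only a rough sketch in Python, not a full RTF parser: it tallies the control words (a backslash, letters, then an optional number) in a string. The names `TOKEN`, `SAMPLE` and `control_word_counts` are made up for this illustration, and the sample text is hand-written, not real Word output.

```python
import re
from collections import Counter

# A control word is a backslash, letters, then an optional signed number.
# The second alternative consumes control symbols such as \\ \{ \} so that
# an escaped backslash is not mistaken for the start of a control word.
TOKEN = re.compile(r"\\(?:([A-Za-z]+)(-?\d+)?|.)", re.DOTALL)

# Hand-written sample for illustration only (hypothetical content).
SAMPLE = r"{\rtf1\ansi\deff0{\fonttbl{\f0 Times New Roman;}}\pard\b Bold\b0  C:\\bin\par}"


def control_word_counts(rtf_text):
    """Return a Counter mapping each control word name to its frequency."""
    counts = Counter()
    for match in TOKEN.finditer(rtf_text):
        word = match.group(1)
        if word is not None:  # skip control symbols like \\ or \{
            counts[word] += 1
    return counts


if __name__ == "__main__":
    for word, n in control_word_counts(SAMPLE).most_common():
        print(f"{n:5d}  \\{word}")
```

An unusually high count for one control word can hint at where the document has become bloated. Embedded binary data is not handled here and could skew the counts, so treat the result as a starting point only.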

Note that a Word document can include a large number of binary objects such
as graphics.

A couple of caveats:

1) If there's anything much wrong with the document, Automation cannot read
it either. Automation depends upon the internal collections in the document
object model being present and intact. If they're not, the object model
collapses.

2) Even if you get the data out, it becomes a huge job to try to analyse
what's wrong. Many of the problems you get in a Word document are due to
excessive levels of abstraction overflowing internal buffers. The code may
be "legal", but it becomes so complex that Word runs out of memory trying to
read it.

XML gives you your best shot: if you get the document out to XML, you can
read and correct most things if you know WordML very well. Regrettably, if
there's much wrong with the document, the XML output filter will fail to
complete the save.
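
If the XML save does complete, the result can be inspected with any XML library. As a minimal sketch, assuming the file is in the Word 2003 XML format (the `wordml` namespace below), this Python snippet counts how many paragraphs use each paragraph style. The function name `paragraph_styles` and the embedded `SAMPLE` document are hypothetical and exist only to keep the example self-contained; newer formats such as .docx use a different namespace.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Namespace used by Word 2003 XML (assumed format for this sketch).
W = "http://schemas.microsoft.com/office/word/2003/wordml"

# Tiny hand-written document standing in for a real "Save as XML" file.
SAMPLE = f"""<w:wordDocument xmlns:w="{W}">
  <w:body>
    <w:p><w:pPr><w:pStyle w:val="Heading1"/></w:pPr><w:r><w:t>Title</w:t></w:r></w:p>
    <w:p><w:r><w:t>Plain text.</w:t></w:r></w:p>
    <w:p><w:pPr><w:pStyle w:val="Heading1"/></w:pPr><w:r><w:t>Next</w:t></w:r></w:p>
  </w:body>
</w:wordDocument>"""


def paragraph_styles(xml_text):
    """Count paragraphs per explicit paragraph style name."""
    root = ET.fromstring(xml_text)
    counts = Counter()
    # iter() walks the whole tree, so nested paragraphs are included too.
    for para in root.iter(f"{{{W}}}p"):
        style = para.find(f"{{{W}}}pPr/{{{W}}}pStyle")
        if style is not None:
            name = style.get(f"{{{W}}}val")
        else:
            name = "(no explicit style)"
        counts[name] += 1
    return counts


if __name__ == "__main__":
    for name, n in paragraph_styles(SAMPLE).most_common():
        print(f"{n:5d}  {name}")
```

For a real file, read its contents and pass them to `paragraph_styles`. If the save was cut short, `ET.fromstring` raises `ParseError`, which at least tells you where the output stops being well-formed.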

Sorry!




--

Please reply to the newsgroup to maintain the thread. Please do not email
me unless I ask you to.

John McGhie
Microsoft MVP, Word and Word for Macintosh. Consultant Technical Writer
Sydney, Australia +61 (0) 4 1209 1410

 



