Need formula to extract a numeric value from a free-format tex
On Mon, 20 Jul 2009 13:16:01 -0700, Eric_NY
wrote:
I see what you mean.
The problem I was getting was that the "\b" was excluding some cases that I
found in my data (such as "SR1234567" and "1234567remedy"). There were other
cases with a non-space immediately adjacent to the 7-digit sequence. So I
just took out the "\b" part of the pattern.
Aha. I see the problem. It's possible (but not necessary in view of what you
wrote below) to account for that. For example, one could look for the 7 digits
to be bounded by either a non-digit or the beginning or end of the line.
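As a rough illustration of that idea, here is a minimal sketch in Python. The thread does not say which regex engine is in use, so the language, the pattern name and the function name are all assumptions made for the example. The lookarounds require that the 7 digits are not touching another digit, and they also succeed at the beginning or end of the line.

```python
import re

# Hypothetical pattern: exactly seven digits with no digit on either side.
# (?<!\d) and (?!\d) also succeed at the start and end of the string.
SEVEN_DIGITS = re.compile(r"(?<!\d)\d{7}(?!\d)")


def find_seven_digit_numbers(text):
    """Return every 7-digit run in text that is not part of a longer number."""
    return SEVEN_DIGITS.findall(text)


if __name__ == "__main__":
    for sample in ("SR1234567", "1234567remedy", "call 12345678 now"):
        print(sample, find_seven_digit_numbers(sample))
```

Some engines (VBScript's RegExp, for one) do not support lookbehind. There, a capture group such as (?:^|\D)(\d{7})(?!\d) is one possible substitute, reading the number from the first group.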
(I was also embarrassed when I presented the results and realized that my
7-digit numbers actually began at 987262 - i.e., a 6-digit number, so not all
of them were in fact 7 digits. So my logic was wrong and I missed one that I
should have found.)
Again, if you required a permanent solution, that could be adjusted for.
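For instance, if the valid numbers turn out to be either 6 or 7 digits long (an assumption based only on the 987262 example; the real range is not stated), the quantifier could be widened. A sketch in Python, with hypothetical names:

```python
import re

# Hypothetical pattern: a run of six or seven digits with no digit
# immediately before or after it, so longer numbers are not matched in part.
SIX_OR_SEVEN_DIGITS = re.compile(r"(?<!\d)\d{6,7}(?!\d)")


def find_record_numbers(text):
    """Return every standalone 6- or 7-digit run found in text."""
    return SIX_OR_SEVEN_DIGITS.findall(text)


if __name__ == "__main__":
    for sample in ("ticket 987262 closed", "SR1234567", "12345678", "12345"):
        print(sample, find_record_numbers(sample))
```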
But for my current purposes this is good enough.
They do say that "perfect is the enemy of good enough" :-)
I'm doing a one-time
analysis of several thousand records, and don't need to develop a permanent,
perfect solution. I revised the regular expression to be "good enough"
considering the data that I saw in front of me.
Many thanks for your help.
Glad to. I learn also. Thanks for the feedback.
--ron