Detecting grids of drug names
Julian Field
MailScanner at ecs.soton.ac.uk
Mon Nov 14 19:11:46 GMT 2005
I have produced a rule which detects grids of letters. Spammers are using a
table trick to rotate the words by 90 degrees, so that in the message source
the letters of the first column all come first, followed by all the letters
of the second column and so on. This stops you detecting words by allowing
for HTML junk in between the letters.
But I can now detect these grids:
rawbody JKF_DRUG_GRID1 /(\>([[:alpha:]]\s){4}[[:alpha:]].*){4}\>/i
describe JKF_DRUG_GRID1 Grid of letters rotated to produce drug names
score JKF_DRUG_GRID1 4.5
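This detects grids of at least 4 columns, each a run of at least 5 letters
(four letter-plus-whitespace pairs followed by a final letter), which is
small enough to detect drug names.
The first "4" sets the minimum number of rows in the grid (the pattern needs
one more letter than that count), the second "4" sets the minimum number of
columns.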
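To see how the two counts behave, here is a rough Python sketch of the same pattern. It is not the SpamAssassin rule itself: Python's re module has no [[:alpha:]], so [A-Za-z] stands in for it (which also makes the /i flag unnecessary), and the names grid_pattern, cells, FIVE and FOUR, as well as the <td>...<br></td> markup, are made up for illustration rather than taken from real spam.

```python
import re

def grid_pattern(pairs=4, runs=4):
    """Python approximation of the JKF_DRUG_GRID1 pattern.

    pairs: the first "4" (letter-plus-whitespace pairs in each run)
    runs:  the second "4" (how many such runs must appear, each after a ">")
    Assumption: [A-Za-z] replaces Perl's [[:alpha:]].
    """
    run = r"(?:[A-Za-z]\s){%d}[A-Za-z]" % pairs
    return re.compile(r"(?:>%s.*){%d}>" % (run, runs))

def cells(words):
    # Hypothetical markup: one table cell per word, letters space-separated.
    return "".join("<td>%s<br></td>" % " ".join(w) for w in words)

FIVE = cells(["abcde", "fghij", "klmno", "pqrst"])  # four runs of 5 letters
FOUR = cells(["abcd", "fghi", "klmn", "pqrs"])      # four runs of 4 letters

print(bool(grid_pattern().search(FIVE)))  # True
print(bool(grid_pattern().search(FOUR)))  # False: each run needs a 5th letter
```

Lowering the first count to 3 makes the four-letter sample match, and raising the second count to 5 makes the five-letter sample stop matching, since it only has four runs.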
Quite succinct once you work out what you are looking for :-)
All improvements and comments are most welcome.
--
Julian Field
www.MailScanner.info