Detecting grids of drug names
David While
David.While at UCE.AC.UK
Wed Nov 16 11:00:06 GMT 2005
I put both these rulesets into my setup and so far I have only seen hits
on Matt's.
--------------------------------------------
David While BSc CEng MBCS CITP
Department of Computing
University of Central England
Tel: 0121 331 6211
--------------------------------------------
-----Original Message-----
From: MailScanner mailing list [mailto:MAILSCANNER at JISCMAIL.AC.UK] On Behalf Of Matt Kettler
Sent: 14 November 2005 21:06
To: MAILSCANNER at JISCMAIL.AC.UK
Subject: Re: Detecting grids of drug names
Julian Field wrote:
> I have produced a rule which detects grids of letters. They are using a
> table trick to rotate the words by 90 degrees so the letters of the
> first column all come first, followed by all the letters of the second
> column and so on. This stops you detecting words with HTML junk in
> between the letters.
>
> But I can now detect these grids:
>
> rawbody JKF_DRUG_GRID1 /(\>([[:alpha:]]\s){4}[[:alpha:]].*){4}\>/i
> describe JKF_DRUG_GRID1 Grid of letters rotated to produce drug names
> score JKF_DRUG_GRID1 4.5
>
> This detects grids of at least 4x4 characters, which is small enough to
> detect drug names.
> The first "4" sets the minimum number of rows in the grid, the second
> "4" sets the minimum number of columns.
>
> Quite succinct once you work out what you are looking for :-)
> All improvements and comments are most welcome.
>
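To experiment with the quoted pattern outside SpamAssassin, here is a minimal Python sketch. It is a translation, not the rule itself: Python's re module has no POSIX classes, so [[:alpha:]] is written as [A-Za-z], and the sample HTML lines are made up for illustration. It also assumes the table cells arrive on a single line, since "." does not cross newlines.

```python
import re

# Translation of JKF_DRUG_GRID1 for Python's re module.
# Assumption: [[:alpha:]] is replaced by [A-Za-z] (ASCII letters only).
GRID = re.compile(r"(>([A-Za-z]\s){4}[A-Za-z].*){4}>", re.IGNORECASE)


def looks_like_grid(line: str) -> bool:
    """Return True if the line contains a letter grid the pattern accepts."""
    return GRID.search(line) is not None


# Hypothetical samples: each <td> cell holds one run of spaced-out letters.
four_cells = (
    "<td>V C X V A</td><td>I I A A M</td>"
    "<td>A A N L B</td><td>G L A I I</td>"
)
three_cells = "<td>V C X V A</td><td>I I A A M</td><td>A A N L B</td>"
short_cells = "<td>V C X V</td>" * 4
plain_text = "<p>Hello world, this is a normal sentence.</p>"

for name, sample in [
    ("four_cells", four_cells),
    ("three_cells", three_cells),
    ("short_cells", short_cells),
    ("plain_text", plain_text),
]:
    print(name, looks_like_grid(sample))
```

As translated, a cell only counts if it starts with four letter-plus-whitespace pairs followed by a fifth letter, and four such cells must be followed by a closing ">". Changing either {4} in the pattern adjusts those two minimums.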
Julian, I had a similar concept on Friday.
Mine work a bit differently: they look for a specific drug name in the
post-HTML-stripped text. Thus far they work quite well, but I've kept the
scores low as I'm still testing them.
See attached.
------------------------ MailScanner list ------------------------
To unsubscribe, email jiscmail at jiscmail.ac.uk with the words:
'leave mailscanner' in the body of the email.
Before posting, read the Wiki (http://wiki.mailscanner.info/) and
the archives (http://www.jiscmail.ac.uk/lists/mailscanner.html).
Support MailScanner development - buy the book off the website!