Detecting grids of drug names

David While David.While at UCE.AC.UK
Wed Nov 16 11:00:06 GMT 2005


I put both these rulesets into my setup and so far I have only seen hits
on Matt's.

--------------------------------------------
David While BSc CEng MBCS CITP
Department of Computing
University of Central England
Tel: 0121 331 6211
-------------------------------------------- 

-----Original Message-----
From: MailScanner mailing list [mailto:MAILSCANNER at JISCMAIL.AC.UK] On
Behalf Of Matt Kettler
Sent: 14 November 2005 21:06
To: MAILSCANNER at JISCMAIL.AC.UK
Subject: Re: Detecting grids of drug names

Julian Field wrote:
> I have produced a rule which detects grids of letters. They are using
> a table trick to rotate the words by 90 degrees so the letters of the
> first column all come first, followed by all the letters of the second
> column and so on. This stops you detecting words with HTML junk in
> between the letters.
> 
> But I can now detect these grids:
> 
> rawbody  JKF_DRUG_GRID1 /(\>([[:alpha:]]\s){4}[[:alpha:]].*){4}\>/i
> describe JKF_DRUG_GRID1 Grid of letters rotated to produce drug names
> score    JKF_DRUG_GRID1 4.5
> 
> This detects grids of at least 4x4 characters, which is small enough
> to detect drug names.
> The first "4" sets the minimum number of rows in the grid, the second
> "4" sets the minimum number of columns.
> 
> Quite succinct once you work out what you are looking for :-)
> All improvements and comments are most welcome.
> 
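For anyone who wants to try the pattern outside SpamAssassin, here is a
minimal Python sketch of the same expression. It is an approximation and
not part of Julian's rule: Python's re module has no POSIX [[:alpha:]]
class, so [A-Za-z] stands in for it, and the sample markup and the function
name looks_like_letter_grid are hypothetical, made up for illustration.

```python
import re

# Approximation of JKF_DRUG_GRID1. Python's re module does not support
# POSIX classes such as [[:alpha:]], so [A-Za-z] is substituted here.
GRID = re.compile(r"(>([A-Za-z]\s){4}[A-Za-z].*){4}>", re.IGNORECASE)


def looks_like_letter_grid(text: str) -> bool:
    """Return True if the text contains four '>'-prefixed runs of
    whitespace-separated single letters, followed by another '>'."""
    return GRID.search(text) is not None


# Hypothetical sample: each table cell holds one run of single letters.
sample = (
    "<td>a b c d e</td>"
    "<td>f g h i j</td>"
    "<td>k l m n o</td>"
    "<td>p q r s t</td>"
)

if __name__ == "__main__":
    print(looks_like_letter_grid(sample))
    print(looks_like_letter_grid("An ordinary line of text."))
```

As written, the inner group needs four letter-plus-whitespace pairs and then
one more letter, so in this sketch a cell must hold five letters before it
counts towards a match.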


Julian, I had a similar concept on Friday.

Mine work a bit differently: they look for a specific drug name in the
post-HTML-stripped text. Thus far they work quite well, but I've kept the
scores low as I'm still testing them.

See attached.

------------------------ MailScanner list ------------------------
To unsubscribe, email jiscmail at jiscmail.ac.uk with the words:
'leave mailscanner' in the body of the email.
Before posting, read the Wiki (http://wiki.mailscanner.info/) and
the archives (http://www.jiscmail.ac.uk/lists/mailscanner.html).

Support MailScanner development - buy the book off the website!
