MS ruleset file sizes

Lindsay Snider lindsay at pa.net
Fri Mar 21 22:37:52 GMT 2003


On Friday 21 March 2003 16:05, you wrote:
> Hi!
>
> > >So what's the current limitation, how long could those files be? :)
> >
> > As I have said elsewhere, the best approach for large config files is to
> > slurp them all in from a database at startup time, then look them up in
> > local hash tables at run time. I can't see tens of thousands causing much
> > of a problem. There is no hard-wired limit at all.
>
> I'll do some field testing with large lists soon, will post some reports
> on the list when I am ready.
>

In the next week or two, I also plan to test with rule files in that range 
(tens of thousands of mailbox rules), and I will share what I find too.

Would a database work for files that are order dependent?  If the current data 
is not order dependent, I'll save you some time: you can stop reading here, 
because the rest of this email won't matter.... :)

From the source, it looks like each file is loaded into an array.  Thus, 
searches on the array will be linear.  Am I understanding the code correctly?  
Is each file stored as an array, with each array then put into a hash keyed 
on the variable name of the file?
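
To make the question concrete, here is roughly how I picture that layout.  
This is only a Python sketch of my reading, not MailScanner's actual Perl; 
the names (rulesets, lookup, "virusscanning") and the glob-style patterns 
are made up for illustration.

```python
from fnmatch import fnmatchcase

# Hypothetical in-memory layout: one ordered list of (pattern, value)
# rules per ruleset file, all kept in a dict keyed by a name for the file.
rulesets = {
    "virusscanning": [
        ("*@example.com", "no"),
        ("bob@example.org", "no"),
        ("*", "yes"),  # catch-all, so it has to come last
    ],
}


def lookup(name, address, default=None):
    """Return the value of the first rule whose pattern matches."""
    for pattern, value in rulesets[name]:  # linear scan over the rules
        if fnmatchcase(address, pattern):
            return value  # first match wins, which is why order matters
    return default
```

With this layout every lookup walks the list from the top, so the cost 
grows with the number of rules in the file.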

Does anyone have performance data on how many rules it takes before the 
linear search becomes noticeable?  If someone does need better than linear 
lookups, maybe the parts of the file where order doesn't matter could be 
loaded into an in-memory hash.  With a goal of minimal changes/hacking in 
mind, here's an idea.

Maybe, within the data files, comments could be used to mark the start and end 
of sections of entries that are not order dependent.  Then, as a modification 
to MailScanner, we could look for those comments.  All data between a pair of 
marker comments would be treated as one element in the array.  That element 
would be a reference to a hash of the entries within.  Then, when reading the 
arrays, we'd watch for hash references and look inside them when found.
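
A minimal sketch of that idea, again in Python rather than MailScanner's 
Perl.  The marker comments, the simplified "pattern value" line format, and 
the function names are all assumptions for illustration, not anything the 
current code supports.

```python
from fnmatch import fnmatchcase

# Hypothetical marker comments delimiting an order-independent section.
BEGIN, END = "# BEGIN UNORDERED", "# END UNORDERED"


def load_rules(lines):
    """Build an ordered rule list; each marked section becomes one dict."""
    rules, block = [], None
    for raw in lines:
        line = raw.strip()
        if line == BEGIN:
            block = {}
            rules.append(block)  # the whole section is a single list element
        elif line == END:
            block = None
        elif not line or line.startswith("#"):
            continue  # blank line or ordinary comment
        else:
            pattern, value = line.split(None, 1)
            if block is not None:
                block.setdefault(pattern, value)  # keep the first duplicate
            else:
                rules.append((pattern, value))
    return rules


def lookup(rules, address, default=None):
    """Scan in order; a dict element is one hash probe, not a scan."""
    for rule in rules:
        if isinstance(rule, dict):
            if address in rule:
                return rule[address]
        else:
            pattern, value = rule
            if fnmatchcase(address, pattern):
                return value
    return default
```

For example, a file with one rule, then a marked section of two addresses, 
then two wildcard rules loads as a four-element list, and the section costs 
a single hash lookup however many entries it holds.  In this sketch only 
exact addresses make sense inside a marked section, since the hash is keyed 
on the literal entry; wildcard rules would have to stay outside it.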

It's just an idea, any thoughts?  If needed, is there any communal interest in 
working on this?

I really like the modularity and readability of the code.  Cheers to Julian!!

lindsay



> Thanks,
> Raymond.



