Looking for recommendation on how to proceed.

Glenn Steen glenn.steen at gmail.com
Mon Aug 28 18:46:22 IST 2006


Sorry, but this one isn't really related to MailScanner. Just some
simple scripting:-).
Rather OT, so don't read unless you're interested.

On 28/08/06, Chris Hammond <chris at tac.esi.net> wrote:
> Hmm.  I'm surprised I didn't find that utility in my travels.  Thanks for pointing this one out.
>
> Thanks
> Chris

Yes, well... there is always one more way:-).

Note that both methods have their "pitfalls":). The comm command needs
its input to be sorted beforehand (you'd be looking for something like
"comm -1 -3 ..."), while my suggested little cat thing is sensitive to
duplicates within the respective files. The latter can be fixed by
running each file through "sort -u" first, so that it works (of
course:-). If one wants a "one-liner" it'd look something like:
(sort -u file1; sort -u file1; sort -u file2) | sort | uniq -u
... perhaps not the most intuitive thing:-).
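
To make the trick a bit less mysterious, here is a small sketch of why
listing file1 twice works. The file contents and the temporary
directory are made up for illustration and are not from the thread.
Counting with "uniq -c" shows that anything present in file1 ends up
with a count of at least two, while a line that exists only in file2 is
counted exactly once, which is what "uniq -u" keeps:

```shell
# Hypothetical sample data (an assumption, not taken from the thread).
tmp=$(mktemp -d)
printf '%s\n' x y x   > "$tmp/file1"
printf '%s\n' x y z z > "$tmp/file2"

# Count how often each line occurs once every file has been de-duplicated.
# Lines from file1 are fed in twice, so they can never have a count of 1.
counts=$( (sort -u "$tmp/file1"; sort -u "$tmp/file1"; sort -u "$tmp/file2") | sort | uniq -c )
printf '%s\n' "$counts"

# uniq -u keeps only the lines with a count of exactly 1,
# i.e. the lines that are in file2 but not in file1.
only_in_file2=$( (sort -u "$tmp/file1"; sort -u "$tmp/file1"; sort -u "$tmp/file2") | sort | uniq -u )
printf '%s\n' "$only_in_file2"

rm -rf "$tmp"
```

Without the "sort -u" step the duplicated "z" in file2 would have a
count of two and be dropped as well.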
This will demonstrate the differences:
b1 and b2 are files with just some random letters (one char/line). b1s
and b2s are the same files, only sorted. Further comments below the
examples:
[root@mail ~]# cat b1
a
b
a
b
c
d
w
[root@mail ~]# cat b2
a
b
d
a
b
f
q
c
d
o
o
w
[root@mail ~]# cat b1 b1 b2|sort|uniq -u
f
q
[root@mail ~]# (sort -u b1;sort -u b1;sort -u b2)|sort|uniq -u
f
o
q
[root@mail ~]# sort b1 >b1s
[root@mail ~]# sort b2 >b2s
[root@mail ~]# comm -1 -3 b1s b2s
d
f
o
o
q
[root@mail ~]# sort -u b1 >b1su
[root@mail ~]# sort -u b2 >b2su
[root@mail ~]# comm -1 -3 b1su b2su
f
o
q
[root@mail ~]#

As you can see, the "cat ..." misses the "o" line: it occurs twice in
b2, so it gets gobbled by uniq -u. The "(sort ..." is correct, and
gives only the lines that are only in file b2.
The "comm ..." includes any repeats ("o" twice, plus the second "d"
from b2) and so gets a bit confused when the lists are sorted but not
de-duplicated. That can be handled by using sort -u when creating the
sorted files, as shown last (b1su/b2su). I think I prefer the
"(sort..." solution:-):-)
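
For anyone who wants to replay this, a rough sketch that wraps both
variants in small shell functions and runs them on the b1/b2 contents
from the session above. The function names only_in_second and
only_in_second_comm are hypothetical, chosen just for this example, and
forcing LC_ALL=C is my own assumption to make sure sort and comm agree
on the ordering:

```shell
# Sketch only; the helper names are hypothetical, not from the thread.
LC_ALL=C; export LC_ALL   # keep sort and comm on the same collation order

# Print the lines that occur in file $2 but not in file $1, each once.
only_in_second() {
    (sort -u "$1"; sort -u "$1"; sort -u "$2") | sort | uniq -u
}

# Same result via comm, which needs sorted, de-duplicated input files.
only_in_second_comm() {
    a=$(mktemp) && b=$(mktemp) || return 1
    sort -u "$1" > "$a"
    sort -u "$2" > "$b"
    comm -13 "$a" "$b"     # -13 is the same as -1 -3
    rm -f "$a" "$b"
}

# Recreate b1 and b2 as listed in the session.
dir=$(mktemp -d)
printf '%s\n' a b a b c d w > "$dir/b1"
printf '%s\n' a b d a b f q c d o o w > "$dir/b2"

result_sort=$(only_in_second "$dir/b1" "$dir/b2")
result_comm=$(only_in_second_comm "$dir/b1" "$dir/b2")
printf '%s\n' "$result_sort"
printf '%s\n' "$result_comm"

rm -rf "$dir"
```

Both functions should print f, o and q, matching the "(sort ..." and
"comm ... b1su b2su" runs.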
(snip)

-- 
-- Glenn
email: glenn < dot > steen < at > gmail < dot > com
work: glenn < dot > steen < at > ap1 < dot > se

