Satya's blog - 2008/01/

Jan 21 2008 22:01 Random nature pictures
Took 3 random nature pictures that came out nicely. One's at the Charleston, SC aquarium -- turtle and snake -- and the other two are pictures of trees after rain, but I wasn't able to get the raindrops as nicely as I wanted. (Pic1 and Pic2)

Tag: pictures

Jan 20 2008 07:35 EVE Online on Linux
So I'm trying out EVE Online because LinuxFUD said it works on Linux. Sure enough, there's a .deb package. I download it. 800 kilobytes? Not a good sign. Yep, it's just an installer. Which downloads 800 *mega*bytes, and the game wants to run in a Windows emulator called Cedega. "Linux support" my furry behind.

Tag: game

Jan 07 2008 14:05 Trust-based spam detection

In this post, I outline a method to determine if a remote mail server attempting to connect to "our" mail server is trying to send spam. By querying a trust network with the remote server's IP, and using trust metrics, we determine if others think this is likely to be a spammer.

Just thought of something. Suppose my mail server had a spam detector that was asked, "hey, spam detector, this is the mail server. I have this other server, 10.0.0.1, wanting to send me email. Is that a spammer?" (Yes, I know 10.0.0.1 is private and probably a gateway router. Shut up and read.) So the detector goes "hold on, let me ask my friends" and asks several known (programmed in by the admins or something) "friends", i.e. other spam detectors, call them A, B, C, D, E. It asks, "Is 10.0.0.1 a spammer, in your opinion?" They reply as follows:

Friend  Answer  Reasoning
A       Yes     10.0.0.0/16 is a bunch of no-good spammers
B       Yes     10.0.0.1 is a spammer, my blocklist says so
C       No      Based on what it hears from its trust group, except Zero
D       Yes     I'm secretly lying, based on the moon phase
E       No      I know 10.0.0.1 sends both spam and ham, so I'll err on the side of caution

Unknown to our spam detector, call it Zero, the reasoning is as in the third column above. Again, the reasoning is unknown to Zero; it only gets the yes/no answers.

Now, Zero trusts its friends by different amounts, as follows: A: 0.2, B: 0.2, C: 0.6, D: -0.2 (it knows D is erratic, or based on other reasons below), and E: 0.7. That gets it 0.2 + 0.2 + (-0.2) - (0.2 * 0.2 * -0.2) for "yes" and 0.6 + 0.7 - (0.6 * 0.7) for "no", which is ~0.2 "yes", ~0.9 "no". So Zero decides 10.0.0.1 is okay, and tells our mail server the same.
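Here's a rough Ruby sketch of that arithmetic, taking the formula literally as written (sum of the trust values minus their product). The helper names `combine` and `scores` are made up for illustration, and returning a lone value unchanged is my assumption, since sum-minus-product would otherwise cancel to zero. Evaluated this way, the "yes" expression comes to about 0.2 and the "no" expression to about 0.9.

```ruby
# Trust Zero places in each friend, and each friend's yes/no answer to
# "is 10.0.0.1 a spammer?" (values taken from the post).
TRUST   = { 'A' => 0.2, 'B' => 0.2, 'C' => 0.6, 'D' => -0.2, 'E' => 0.7 }.freeze
ANSWERS = { 'A' => true, 'B' => true, 'C' => false, 'D' => true, 'E' => false }.freeze

# Combine trust values as the post writes it out: sum minus product.
# Hypothetical helper. A single value is returned as-is (an assumption),
# and an empty list scores 0.0.
def combine(values)
  return 0.0 if values.empty?
  return values.first if values.size == 1

  values.sum - values.reduce(:*)
end

# Returns [yes_score, no_score] for the given trust table and answers.
def scores(trust, answers)
  yes, no = answers.keys.partition { |friend| answers[friend] }
  [yes, no].map { |group| combine(group.map { |friend| trust.fetch(friend) }) }
end

yes_score, no_score = scores(TRUST, ANSWERS)
verdict = yes_score > no_score ? 'spammer' : 'okay'
puts format('yes: %.3f  no: %.3f  => 10.0.0.1 is %s', yes_score, no_score, verdict)
```

For two values this is the usual "probability of a or b" formula; for three or more it is not a proper probability, which fits the "thinly-disguised probabilities" caveat.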

Now how did the others arrive at their conclusions? Based on blocklists, reported spam, reported false positives/negatives, and their own trust network (excluding Zero, who's doing the asking). Each one cached the trust metrics (we used thinly-disguised probabilities here, but anyone can use whatever they want -- their trust metric may suffer).

Also, based on arriving at a "probably not spammer" answer, Zero decided to trust A, B, and D slightly less in the future. How much less? That's up to Zero, but maybe something like "it was close, so I'll reduce them by 0.01 each and increase the ones that agree by 0.01" or "it was wildly off the mark, let's reduce all the wrong 'uns by 0.1 and increase all the right ones by 0.02 (because I don't want to trust anyone too much)." These algorithms may also vary, and may have upper/lower limits, slew rates, admin-determined limits, and so forth.
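One way the adjustment could look in Ruby, as a sketch only. The method name `adjust_trust`, the -1.0..1.0 limits, and the rounding are all assumptions of mine; the step sizes are the two examples from the paragraph above.

```ruby
# Hypothetical bounds on trust; the post only says limits may exist.
MIN_TRUST = -1.0
MAX_TRUST = 1.0

# Returns a new trust table. Friends whose answer matched the verdict gain
# `reward`, friends who disagreed lose `penalty`, and friends who gave no
# answer are left alone. Results are clamped to [MIN_TRUST, MAX_TRUST] and
# rounded to 4 places to keep float noise out of the table.
def adjust_trust(trust, answers, verdict, penalty:, reward:)
  trust.map do |friend, value|
    next [friend, value] unless answers.key?(friend)

    delta = answers[friend] == verdict ? reward : -penalty
    [friend, (value + delta).clamp(MIN_TRUST, MAX_TRUST).round(4)]
  end.to_h
end

trust   = { 'A' => 0.2, 'B' => 0.2, 'C' => 0.6, 'D' => -0.2, 'E' => 0.7 }
answers = { 'A' => true, 'B' => true, 'C' => false, 'D' => true, 'E' => false }

# Verdict from the post: 10.0.0.1 is not a spammer (false).
# "It was close": nudge everyone by 0.01.
updated = adjust_trust(trust, answers, false, penalty: 0.01, reward: 0.01)

# "Wildly off the mark": punish by 0.1, reward by only 0.02.
harsh = adjust_trust(trust, answers, false, penalty: 0.1, reward: 0.02)

puts updated.inspect
puts harsh.inspect
```

Slew rates and admin-determined limits would slot in where the clamp is.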

The result of Zero's determination can be fed into SpamAssassin, too.

I've heard of distributed signature-based spam detectors. I think razor is one? I also searched around a bit before writing this post (and discussed it in chat), so I found this abstract: Establishing Trust Between Mail Servers to Improve Spam Filtering. I thought of the whole idea while in the kitchen. I think it differs from most others by using just the IP of the connecting mail server, rather than hashing the entire message or headers.

Feedback from those who understand such things is appreciated. Please don't tell me it's stuupid (sic). That's not constructive.

Last updated: Jan 07 2008 14:40

Tag: hmm geeky

Jan 03 2008 13:14 ldapsearch and a couple of other things

First, because I keep forgetting, here's ldapsearch:

ldapsearch -vvv -Hldaps://[host] -D "uid=...,ou=People,dc=...,dc=..." -b "dc=....,dc=..." -x -S ??? "(uid=...)" -W

(-D is the DN to bind as, -b is the search base, -S sorts by the given attribute, -x is simple auth, -W prompts for the password to bind with)

Now, in a Ruby script, here's how to authenticate someone:

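require 'ldap'

# Authenticates a user against LDAP with a simple bind over SSL.
class LdapUser
    @@host= '[host]'
    @@port= xxx   # placeholder: put your LDAPS port here
    @@base= 'ou=People,dc=...,dc=.'

    # Returns the user's cn on success, false otherwise.
    def self.authenticated?(user, pass)
        # Plain-Ruby stand-in for Rails' blank?, so this also runs outside Rails.
        # Empty passwords must be refused here: some servers treat a bind
        # with an empty password as anonymous and report success.
        return false if (user.to_s.strip.empty? or pass.to_s.strip.empty?)
        name=false
        begin
            conn = LDAP::SSLConn.new(@@host, @@port, false)
            # A failed bind raises; the rescue below turns that into false.
            if conn.bind("uid=#{user},#{@@base}",pass)
                i=0
                conn.search(@@base, LDAP::LDAP_SCOPE_ONELEVEL, "(uid=#{user})", ['cn']) {|entry|
                    i+=1
                    return false if i> 1   # more than one match is ambiguous, so refuse
                    name=entry.vals('cn')[0]
                }
            end
        rescue StandardError
            # Connection problems and bad credentials all count as "not authenticated".
            return false
        end
        return name
    end
end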

This uses.... libldap-ruby? I think. According to mcg, anyway.
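One thing the class above doesn't do is escape the username before dropping it into the search filter, so a name containing * or parentheses changes what the filter means. Here's a small sketch of RFC 4515 filter escaping in plain Ruby. The helper name `ldap_filter_escape` is made up; it isn't part of the LDAP library. The bind DN needs its own, different escaping (RFC 4514), which isn't shown here.

```ruby
# Escape a value for use inside an LDAP search filter (RFC 4515):
# backslash, *, (, ) and NUL become a backslash plus two hex digits.
# Hypothetical helper, not part of any LDAP library.
def ldap_filter_escape(value)
  value.to_s.gsub(/[\\*()\x00]/) { |ch| format('\\%02x', ch.ord) }
end

user   = 'jdoe*)(uid=*'
filter = "(uid=#{ldap_filter_escape(user)})"
puts filter
```

The escaped filter can then be passed to conn.search in place of the interpolated one.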

Update: to search LDAP without having to log in, use this:

ldapsearch -vvv -Hldaps://[host] -D "ou=...,dc=...,dc=..." -x "(uid=...)"

Last updated: May 01 2008 13:01

Tag: geeky howto