Quote:
Originally Posted by UK Jimbo
Hi Michael,
I couldn't get hold of you on MSN Messenger so I've penned down some thoughts/questions here:
New code
I like the style, the comments, and the fact that it's called from PHPINCLUDE_START. Good work.
|
Thanks
Quote:
Blacklists
What do you think the best way of storing the blacklist(s) and making it editable is? I wonder if a phrase would be a good plan (even if it was managed through a custom part of the admincp rather than the phrase manager).
|
Since the master blacklist gets updated a lot, I'd like to keep it separate from the local list. Storing it as a text file can be optional, but storing it as a template needs to be an option for those whose file systems are set up such that PHP can't write to them (for the same reason, vBulletin has the option to keep CSS definitions in the page itself, although this is far less efficient).
The local list needs to be a template for quick accessibility.
At 50K and growing, I don't think it's gonna fit in the phrase system.
At some point we need a cron job to go to Jay Allen's site and pull down the updates to the list (he's given his permission for this). To spare his bandwidth, the system should only do a full refresh when requested from the admincp. Most of the time it should download the latest 100 additions about once every 3 to 5 days.
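Here's a rough sketch of how the cron job could decide what to fetch. It's in Python just to keep it short (the hack itself would be PHP). The names choose_refresh and merge_additions are made up for illustration, the 3-day interval is simply the low end of "3 to 5 days", and the actual download from Jay Allen's site is left out because I don't want to guess at the URL or feed format.

```python
from datetime import datetime, timedelta

# Assumption: "every 3 to 5 days" is pinned to 3 days here; tune as needed.
INCREMENTAL_INTERVAL = timedelta(days=3)
# "Latest 100 additions", as described in the post.
INCREMENTAL_COUNT = 100


def choose_refresh(last_update, now, full_requested):
    """Return 'full', 'incremental' or 'skip' for this cron run."""
    if full_requested:
        # A full refresh only happens when asked for from the admincp.
        return "full"
    if now - last_update >= INCREMENTAL_INTERVAL:
        return "incremental"
    return "skip"


def merge_additions(master, additions):
    """Append up to INCREMENTAL_COUNT new entries, skipping known ones."""
    known = set(master)
    merged = list(master)
    for entry in additions[:INCREMENTAL_COUNT]:
        if entry not in known:
            known.add(entry)
            merged.append(entry)
    return merged


if __name__ == "__main__":
    mode = choose_refresh(datetime(2005, 1, 1), datetime(2005, 1, 5), False)
    print(mode)
```

Keeping the decision separate from the download means the admincp "full refresh" button only has to set a flag that the next cron run picks up.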
Quote:
Multiple fields
What are your thoughts on breaking down the fields passed into the "spam engine"? I'm thinking along the lines of the way that the second version of spamBuster was able to have rules relating to the body text or the subject. Username might be another field worth matching against - lots of spammers seem to use the recipe [username][number] like robby34. Perhaps something to worry about later.
|
On large boards such as mine there are many legit users who use numbers in their user names, so it would be difficult if not impossible to make that a usable signal.
What would be ideal is an algorithm to iterate over the user's signature, post, and title, extract all URLs and put them in an array for each field. Then compare these arrays for a match. Depending on the number of matches, we can extract domain names from the arrays and add them to the local list.
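Something along these lines is what I have in mind, sketched in Python for brevity rather than PHP. The function names, the URL pattern, and the min_fields cutoff are all assumptions for illustration, and the regex is deliberately simple (http/https links only).

```python
import re
from urllib.parse import urlparse

# Simplified pattern: stops at whitespace, quotes, angle brackets and
# BBCode brackets. A real implementation would likely need more cases.
URL_RE = re.compile(r"https?://[^\s\[\]<>\"']+", re.IGNORECASE)


def extract_urls(text):
    """Return all http(s) URLs in text, minus trailing punctuation."""
    return [u.rstrip(".,;:!?)") for u in URL_RE.findall(text)]


def domain_of(url):
    """Return the lowercased host name without a leading 'www.'."""
    host = urlparse(url).hostname or ""
    return host[4:] if host.startswith("www.") else host


def repeated_domains(signature, post, title, min_fields=2):
    """Domains that show up in at least min_fields of the three fields."""
    counts = {}
    for text in (signature, post, title):
        # Use a set so a domain counts once per field, not once per link.
        for domain in {domain_of(u) for u in extract_urls(text)}:
            if domain:
                counts[domain] = counts.get(domain, 0) + 1
    return sorted(d for d, n in counts.items() if n >= min_fields)


if __name__ == "__main__":
    print(repeated_domains(
        "Visit http://www.example.com/pills",
        "Buy at http://example.com/buy [url]http://other.example.org[/url]",
        "http://EXAMPLE.com rocks",
    ))
```

Whatever comes back from repeated_domains would be the candidates for the local list, subject to whatever match threshold we settle on.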
The message itself should be scanned for spamminess: repeated use of the $ character, FREE in all caps, and maybe a user-definable list of unusual words that shouldn't occur on a normal basis (viagra, for example).
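A minimal sketch of that kind of scoring, again in Python for brevity. The spam_score name, the one-point-per-signal weighting, and the threshold of three $ characters are assumptions picked for illustration, not tested values.

```python
import re


def spam_score(message, unusual_words, dollar_threshold=3):
    """Add one point per spam signal found in the message."""
    score = 0
    # Signal 1: repeated use of the $ character.
    if message.count("$") >= dollar_threshold:
        score += 1
    # Signal 2: the word FREE in all caps (case-sensitive on purpose).
    if re.search(r"\bFREE\b", message):
        score += 1
    # Signal 3: one point for each word from the user-defined list.
    words = set(re.findall(r"[a-z0-9']+", message.lower()))
    score += sum(1 for w in unusual_words if w.lower() in words)
    return score


if __name__ == "__main__":
    print(spam_score("FREE viagra $$$ today", ["viagra"]))
```

What score counts as spam would be another vboption, since it will vary from board to board.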
Quote:
Are you happy with me going ahead and writing a lower level library that does the spam processing and leaving some of the vBulletin integration (admincp code) to you?
|
Sounds good. I'll start with the installer to set up the vboptions for this hack.
As far as functions go: right now the code here has the actions taken inside the searching loop. To be honest, these need to be separate. Set up some kind of static variable to count matches and return it, and put the ban action in a separate function (or look into the possibility of using the existing ban functions). BTW, I noticed that in spamBuster you wrote a routine to send mail. There's already a mail function in vBulletin: vbmail. It's defined in the functions library, which is included on all executions of the vBulletin code.
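To show the split I mean, here's a rough sketch in Python (the real thing would be PHP). The names count_matches and handle_post are hypothetical, the ban action is passed in as a callback so it could just as well be one of the existing ban functions, and a plain return value stands in for the static counter.

```python
def count_matches(text, blacklist):
    """Only search: return how many blacklist entries occur in text."""
    lowered = text.lower()
    return sum(1 for entry in blacklist if entry and entry.lower() in lowered)


def handle_post(text, blacklist, ban_action, threshold=1):
    """Only decide: call ban_action when enough entries matched."""
    matches = count_matches(text, blacklist)
    if matches >= threshold:
        # The action lives outside the search loop, so it can be swapped.
        ban_action(matches)
        return True
    return False


if __name__ == "__main__":
    handle_post(
        "cheap pills at spam-domain.example",
        ["spam-domain.example"],
        lambda n: print("would ban, matches:", n),
    )
```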
Quote:
Looking at those changes who'd guess I've developed in Perl a fair bit? :ermm:
|
I personally avoid PCRE expressions like the plague, but sometimes they're the only way to go.