Pages:
1
..
22
23
24
25
26
..
28 |
Melgar
Anti-Spam Agent
Posts: 2004
Registered: 23-2-2010
Location: Connecticut
Member Is Offline
Mood: Estrified
|
|
BotKilla in action.
I sort of gave it an overhaul today that lets me run the spam tests independently of the main script. This makes it a lot easier for me to
test spam that isn't caught by the filter and to quickly write and test new rules that might do a better job of catching it.
If any spam does get through the filter, it'd help to move it to Detritus so I can take a look at it later and see how it got through.
The first step in the process of learning something is admitting that you don't know it already.
I'm givin' the spam shields max power at full warp, but they just dinna have the power! We're gonna have to evacuate to new forum software!
|
|
streety
Hazard to Others
Posts: 110
Registered: 14-5-2018
Member Is Offline
|
|
Are you checking for false positives?
|
|
Melgar
Anti-Spam Agent
Posts: 2004
Registered: 23-2-2010
Location: Connecticut
Member Is Offline
Mood: Estrified
|
|
Yes, so far just that one, and that was because of a glitch in the way links were checked against the whitelist. I PMed him and told him what
happened, and included the text of his post in case he wanted to repost it. I don't think he's been back in a few days though. Once I fixed that
glitch, it no longer got flagged. Really, the scrutiny is just a lot more on recently-registered users, especially when posting links. His link
would have been whitelisted though, had the whitelisting been working correctly at the time. I also raised the threshold for what gets deleted and
added a few more conditions that can trigger additional flags. The "linking from unrecognized domain" flag has really been a key predictor of spam,
by a long shot. And the "registered today" flag too, just because it focuses on the narrow subset of members who post 99% of the spam.
The algorithm has worked flawlessly, actually, it's just a stupid bug in my code that was responsible for that one false positive.
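To illustrate the kind of flag scoring described above, here is a minimal Python sketch. It is not BotKilla's actual code: the flag names, weights, threshold, and the tiny stand-in whitelist are all assumptions made up for the example. The only things taken from the post are the ideas that links are checked against a whitelist, that "unrecognized domain" and "registered today" are strong flags, and that a threshold decides what gets deleted.

```python
from dataclasses import dataclass
from urllib.parse import urlparse

# Hypothetical weights and threshold, chosen only for illustration.
FLAG_WEIGHTS = {
    "unrecognized_domain": 3,
    "registered_today": 2,
    "first_post": 1,
}
DELETE_THRESHOLD = 4  # higher than any single weight, so one flag is never enough

WHITELIST = {"sciencemadness.org", "wikipedia.org"}  # stand-in for the real list


@dataclass
class Post:
    body_links: list        # URLs found in the post body
    author_age_days: int    # days since the author registered
    author_post_count: int  # posts the author had made before this one


def link_domain(url: str) -> str:
    """Return the host of a URL, lowercased and without a leading 'www.'."""
    host = (urlparse(url).hostname or "").lower()
    return host[4:] if host.startswith("www.") else host


def is_whitelisted(domain: str) -> bool:
    """Accept the domain itself or any parent domain that is on the whitelist."""
    parts = domain.split(".")
    return any(".".join(parts[i:]) in WHITELIST for i in range(len(parts)))


def raised_flags(post: Post) -> set:
    """Collect every flag this post trips."""
    flags = set()
    if any(not is_whitelisted(link_domain(u)) for u in post.body_links):
        flags.add("unrecognized_domain")
    if post.author_age_days == 0:
        flags.add("registered_today")
    if post.author_post_count == 0:
        flags.add("first_post")
    return flags


def should_delete(post: Post) -> bool:
    """Delete only when the combined flag weight reaches the threshold."""
    score = sum(FLAG_WEIGHTS[f] for f in raised_flags(post))
    return score >= DELETE_THRESHOLD
```

With these made-up numbers, a long-standing member linking to an unknown domain scores 3 and is left alone, while a brand-new account doing the same on its first post scores 6 and is deleted.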
|
|
fusso
International Hazard
Posts: 1922
Registered: 23-6-2017
Location: 4 ∥ universes ahead of you
Member Is Offline
|
|
Maybe BotKilla needs another feature:
temporarily ban a new account if it has had more than a certain # of spam posts deleted, say 5?
|
|
Melgar
Anti-Spam Agent
Posts: 2004
Registered: 23-2-2010
Location: Connecticut
Member Is Offline
Mood: Estrified
|
|
I only have moderation privileges; I think banning someone requires an admin. Not that it matters, since it looks like spam is almost always deleted
within a minute after being posted. Really, this is a temporary measure to make this board usable until we can commit to switching forum software,
and make sure we're all on the same page as far as how we go about it. Obviously, the sooner we do this, the better.
|
|
phlogiston
International Hazard
Posts: 1379
Registered: 26-4-2008
Location: Neon Thorium Erbium Lanthanum Neodymium Sulphur
Member Is Offline
Mood: pyrophoric
|
|
Does BotKilla have a day off or did the spammers find a way to circumvent it?
I just browsed through 5 pages of today's posts and found just 10 'real' threads in them.
-----
"If a rocket goes up, who cares where it comes down, that's not my concern said Wernher von Braun" - Tom Lehrer
|
|
Melgar
Anti-Spam Agent
Posts: 2004
Registered: 23-2-2010
Location: Connecticut
Member Is Offline
Mood: Estrified
|
|
Oops. There was some strange character encoding in one of the spam posts that crashed BotKilla, and I didn't notice what happened until just a little
while ago. I think I've covered all the bases now, but the only way to know for sure is to just let it run and keep checking it.
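The post doesn't say exactly what the encoding problem was, so the following is only a guess at one common defensive pattern, sketched in Python: try a couple of likely encodings and, if none fits, fall back to a lossy decode instead of raising. The function name and the list of encodings are assumptions for the example.

```python
def safe_decode(raw: bytes, encodings=("utf-8", "cp1252")) -> str:
    """Decode page bytes without ever raising on odd characters."""
    for enc in encodings:
        try:
            return raw.decode(enc)
        except UnicodeDecodeError:
            continue  # try the next candidate encoding
    # Last resort: keep what can be read, mark the rest with U+FFFD.
    return raw.decode("utf-8", errors="replace")
```

The trade-off is that a lossy decode can hide a few characters from the spam rules, but a post with replacement characters in it can still be scored, whereas a crash stops everything.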
Really though, this is turning into quite a lot of responsibility. It's nothing I can't handle, but it would certainly help if there was compensation
that came with it. I guess I should probably start a new thread on the subject to see what people think.
|
|
fusso
International Hazard
Posts: 1922
Registered: 23-6-2017
Location: 4 ∥ universes ahead of you
Member Is Offline
|
|
Quote: Originally posted by Melgar | Really though, this is turning into quite a lot of responsibility. It's nothing I can't handle, but it would certainly help if there was compensation
that came with it. I guess I should probably start a new thread on the subject to see what people think. | Maybe put the thread in whimsy?
|
|
fusso
International Hazard
Posts: 1922
Registered: 23-6-2017
Location: 4 ∥ universes ahead of you
Member Is Offline
|
|
Hidden spam in signature:
https://www.sciencemadness.org/whisper/viewthread.php?tid=65...
|
|
unionised
International Hazard
Posts: 5126
Registered: 1-11-2003
Location: UK
Member Is Offline
Mood: No Mood
|
|
What the hell is it with mangosteens?
Anyway, I thought I'd say thanks to whoever else was involved in the episode about 15 minutes ago where a whole bunch of spam was quickly detected +
wiped.
Very satisfying.
|
|
j_sum1
Administrator
Posts: 6320
Registered: 4-10-2014
Location: At home
Member Is Offline
Mood: Most of the ducks are in a row
|
|
Just you and Melgar's BotKilla this time, unionised.
I've got no idea what a mangosteen is either and I am not googling it.
|
|
unionised
International Hazard
Posts: 5126
Registered: 1-11-2003
Location: UK
Member Is Offline
Mood: No Mood
|
|
Nice to know I achieved something today.
A mangosteen is a fruit, BTW. Nice enough. Nothing special except it's a bit of a novelty for most people in the
UK or US
|
|
fusso
International Hazard
Posts: 1922
Registered: 23-6-2017
Location: 4 ∥ universes ahead of you
Member Is Offline
|
|
Quote: Originally posted by unionised | Nice to know I achieved something today.
A mangosteen is a fruit, BTW. Nice enough. Nothing special except it's a bit of a novelty for most people in the
UK or US | Mangosteen is also a...typhoon aka hurricane/cyclone
|
|
Tsjerk
International Hazard
Posts: 3032
Registered: 20-4-2005
Location: Netherlands
Member Is Offline
Mood: Mood
|
|
Very nice work, Melgar! Could I have a copy of your code? Out of curiosity.
|
|
Melgar
Anti-Spam Agent
Posts: 2004
Registered: 23-2-2010
Location: Connecticut
Member Is Offline
Mood: Estrified
|
|
Quote: Originally posted by unionised | What the hell is it with mangosteens?
Anyway, I thought I'd say thanks to whoever else was involved in the episode about 15 minutes ago where a whole bunch of spam was quickly detected +
wiped.
Very satisfying. |
Oh right. If my U2U message count shoots up due to spam reports, that's my cue to check on the BotKilla script. Indeed, it had malfunctioned and
needed to be restarted. Since I let this thing loose with a lot of untested code initially, I've been operating under the principle that it's better
to have it stop when it malfunctions. That way, I can see what caused the problem, and there isn't a malfunctioning bot on the loose, deleting posts.
But like the last three times that's happened, it's been some sort of temporary network error, so next time that happens I'll set it to ignore that
specific error and just keep trying again until it works.
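A rough Python sketch of that "ignore that specific error and keep trying" idea follows. It is an illustration, not the BotKilla code: `TransientNetworkError` is a hypothetical stand-in for whatever exception the HTTP library actually raises, and the delay is an arbitrary choice. Only the named error is retried; anything else still stops the script, which keeps the "better to stop when it malfunctions" principle intact.

```python
import time


class TransientNetworkError(Exception):
    """Stand-in for the exception raised on a timeout or dropped connection."""


def with_retries(action, retry_on=(TransientNetworkError,), delay_seconds=5.0,
                 max_attempts=None, sleep=time.sleep):
    """Run action(); retry only on the listed errors, waiting between tries.

    max_attempts=None means keep trying until it works.
    """
    attempt = 0
    while True:
        attempt += 1
        try:
            return action()
        except retry_on:
            if max_attempts is not None and attempt >= max_attempts:
                raise  # give up and let the caller see the error
            sleep(delay_seconds)
```

Usage would look like `page = with_retries(lambda: fetch_forum_index())`, where `fetch_forum_index` is a hypothetical function that raises `TransientNetworkError` on a network hiccup.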
When it finally does get running again, it keeps track of how many spam posts it's deleted per hour, and I admit that seeing spam text scroll past
faster than I can read it, and watching that rate temporarily shoot up into the tens of thousands is quite satisfying.
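One way a "deletions per hour" figure like that could be computed is with a sliding window that is scaled up to an hourly rate, which would also explain how a short burst can read as tens of thousands per hour. This Python sketch is an assumption about the approach, not a description of the real script; the class name and window size are made up.

```python
import time
from collections import deque


class DeletionRate:
    """Track deletions in a sliding window and report them as a per-hour rate."""

    def __init__(self, window_seconds=3600.0, clock=time.monotonic):
        self._window = window_seconds
        self._clock = clock
        self._stamps = deque()  # timestamps of recent deletions, oldest first

    def record(self):
        """Call once for every deleted post."""
        self._stamps.append(self._clock())

    def per_hour(self):
        """Drop timestamps older than the window, then scale the count to an hour."""
        now = self._clock()
        while self._stamps and now - self._stamps[0] > self._window:
            self._stamps.popleft()
        return len(self._stamps) * 3600.0 / self._window
```

With a 60-second window, ten deletions in one burst show up as 600 per hour until they age out of the window.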
|
|
Melgar
Anti-Spam Agent
Posts: 2004
Registered: 23-2-2010
Location: Connecticut
Member Is Offline
Mood: Estrified
|
|
Ok, here's the main script:
https://github.com/toldani/sm-transition/blob/master/botkill...
And here's the domain whitelist that links are checked against:
https://github.com/toldani/sm-transition/blob/master/sm-link...
No single flag is enough to trigger a deletion, but most spam is deleted when a user's first post includes a link to a domain not on the whitelist.
The whitelist was generated by scanning 14 years of forum history and storing all the domains that were linked to at least ten times.
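For readers who don't want to dig through the repository, here is a small Python sketch of how a whitelist like that could be built from post history. It follows the rule stated above (keep domains linked at least ten times) but is not taken from the linked code; the URL regex, function name, and the input format (an iterable of post bodies as strings) are assumptions.

```python
import re
from collections import Counter
from urllib.parse import urlparse

# Loose URL matcher for the example; stops at whitespace, angle brackets, and square brackets.
URL_PATTERN = re.compile(r"https?://[^\s<>\[\]]+")
MIN_LINK_COUNT = 10  # "linked to at least ten times"


def build_whitelist(post_bodies, min_count=MIN_LINK_COUNT):
    """Count how often each domain is linked and keep the frequently linked ones."""
    counts = Counter()
    for body in post_bodies:
        for url in URL_PATTERN.findall(body):
            try:
                host = (urlparse(url).hostname or "").lower()
            except ValueError:
                continue  # skip anything urlparse refuses to handle
            host = host.rstrip(".")
            if host.startswith("www."):
                host = host[4:]
            if host:
                counts[host] += 1
    return {domain for domain, n in counts.items() if n >= min_count}
```

The appeal of this design is that spam domains rarely appear ten times in the history of real posts, so the list needs little hand curation.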
|
|
unionised
International Hazard
Posts: 5126
Registered: 1-11-2003
Location: UK
Member Is Offline
Mood: No Mood
|
|
I'm sure the question was asked earlier but... can you get the code to spot posts in Cyrillic?
|
|
Melgar
Anti-Spam Agent
Posts: 2004
Registered: 23-2-2010
Location: Connecticut
Member Is Offline
Mood: Estrified
|
|
Yeah, it already does. Also, Chinese, Tagalog, Arabic, and Thai. Although I've really only seen much Cyrillic and Chinese.
Right now, it only gets new threads and not spam that's been tacked onto the end of existing threads, but that's mostly because that spam is less
common and it usually gets deleted before I have a chance to figure out how to detect it.
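As an illustration of how script detection can work in general, here is a minimal Python sketch that uses Unicode character names to measure how much of a post is written in a given script. It is a guess at the technique, not the method BotKilla uses; the function names, the 50% threshold, and the list of watched scripts are assumptions. Tagalog is left out because it is normally written in Latin letters, so it would need a word-based check rather than a script check.

```python
import unicodedata

# Prefixes of Unicode character names for the scripts to watch (illustrative list).
WATCHED_SCRIPTS = ("CYRILLIC", "CJK", "ARABIC", "THAI")


def script_share(text):
    """Return {script: fraction of all letters} for the watched scripts."""
    letters = [ch for ch in text if ch.isalpha()]
    if not letters:
        return {}
    counts = {}
    for ch in letters:
        name = unicodedata.name(ch, "")
        for script in WATCHED_SCRIPTS:
            if name.startswith(script):
                counts[script] = counts.get(script, 0) + 1
                break
    return {script: n / len(letters) for script, n in counts.items()}


def mostly_watched_script(text, threshold=0.5):
    """True when at least `threshold` of the letters are in a watched script."""
    return sum(script_share(text).values()) >= threshold
```

Using a share of letters rather than "any such character present" avoids flagging a legitimate post that merely quotes a Russian paper title or a Chinese supplier name.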
|
|
Texium
Administrator
Posts: 4580
Registered: 11-1-2014
Location: Salt Lake City
Member Is Offline
Mood: PhD candidate!
|
|
In case anyone was wondering, I reported the Melgar impersonator as Spam in order to swiftly destroy it. So that's where it went.
|
|
Melgar
Anti-Spam Agent
Posts: 2004
Registered: 23-2-2010
Location: Connecticut
Member Is Offline
Mood: Estrified
|
|
Oh, ok. I was kinda wondering what happened with that.
It's nice to see so many new threads being started that aren't spam. Also, I've
configured BotKilla to wait five minutes after it crashes, then start running again. The main thing crashing it now seems to actually be if a thread
gets deleted between when BotKilla determines a thread to be spam, and when it tries to navigate to that thread to delete it. So really it was just a
404 error that only became a problem because I hadn't set up error handling in the "delete thread" function. But since it seems to cause way more
problems when it's down than when it's up and malfunctioning slightly, it should now recover from any errors that it throws, and go back to hunting
spam.
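The two fixes described above (treat a missing thread as "already handled", and wait five minutes before restarting after any other crash) can be sketched in Python like this. It is an illustrative outline under stated assumptions, not the real script: `ThreadNotFound`, `fetch_and_delete`, and `run_once` are hypothetical names, and `max_crashes` exists only so the loop can be stopped in a test.

```python
import logging
import time

RESTART_DELAY_SECONDS = 5 * 60  # "wait five minutes after it crashes"


class ThreadNotFound(Exception):
    """Stand-in for an HTTP 404 when the thread page no longer exists."""


def delete_thread(thread_id, fetch_and_delete):
    """Delete a thread; if someone else already removed it, just move on."""
    try:
        fetch_and_delete(thread_id)
        return True
    except ThreadNotFound:
        logging.info("thread %s already gone, skipping", thread_id)
        return False


def supervise(run_once, max_crashes=None, delay=RESTART_DELAY_SECONDS,
              sleep=time.sleep):
    """Call run_once() repeatedly; after a crash, log it, wait, and start again."""
    crashes = 0
    while True:
        try:
            run_once()
        except Exception:
            logging.exception("bot crashed; restarting after a pause")
            crashes += 1
            if max_crashes is not None and crashes >= max_crashes:
                return crashes
            sleep(delay)
```

The pause matters: if the crash is caused by a post that is still sitting on the board, restarting instantly would hit the same post again, while a few minutes gives a human moderator a chance to remove it.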
|
|
Mr. Rogers
Hazard to Others
Posts: 184
Registered: 30-10-2017
Location: Ammonia Avenue
Member Is Offline
Mood: No Mood
|
|
I haven't seen any spam in two days. Something is obviously working.
|
|
12thealchemist
Hazard to Others
Posts: 181
Registered: 1-1-2014
Location: The Isle of Albion
Member Is Offline
Mood: Rare and Earthy
|
|
I'm afraid you've just jinxed it. Two spam posts this evening, but different to the usual ones - paper writing and increased business
|
|
Melgar
Anti-Spam Agent
Posts: 2004
Registered: 23-2-2010
Location: Connecticut
Member Is Offline
Mood: Estrified
|
|
I lowered the refresh time from 60 seconds to 30 seconds, meaning that spam that's slated for automatic deletion will only be visible half as long
now. It's not actually that big of a deal, but if you guys don't mind, could you just wait about a minute to see if spam gets deleted automatically
before reporting it? Most of the spam that gets reported actually does end up getting auto-deleted a few seconds later.
The only spam I've seen getting through regularly are these super vague posts with no links and terrible English. Since those are the epitome of
pointlessness though, I'm not worried about them, and have just been deleting those manually.
|
|
fusso
International Hazard
Posts: 1922
Registered: 23-6-2017
Location: 4 ∥ universes ahead of you
Member Is Offline
|
|
Quote: Originally posted by Melgar | It's nice to see so many new threads being started that aren't spam. Also, I've
configured BotKilla to wait five minutes after it crashes, then start running again. | Why not config it to
restart sooner, preferably instantly?
|
|
Melgar
Anti-Spam Agent
Posts: 2004
Registered: 23-2-2010
Location: Connecticut
Member Is Offline
Mood: Estrified
|
|
Sometimes the reason it's crashing is because there's content in a spam message that causes a crash. In this case, BotKilla has to wait for a human
moderator to delete the spam that it got stuck on. If it just restarted instantly, it could get stuck in an infinite loop until it goes haywire and
starts marauding through the forums destroying everything it sees.
Actually it'd probably just generate a huge repetitive log file and crash in a way that it can't recover from so easily, but a malfunctioning robot
wearing Crips colors and killing everything in its path is more fun to imagine, no?
|
|