Tsjerk
International Hazard
Posts: 3032
Registered: 20-4-2005
Location: Netherlands
Member Is Offline
Mood: Mood
|
|
cyrillic you mean?
|
|
Melgar
Anti-Spam Agent
Posts: 2004
Registered: 23-2-2010
Location: Connecticut
Member Is Offline
Mood: Estrified
|
|
Ok, my proprietary spam-destroying system "botkilla" has been activated. There was just an insane amount of spam, and even with the threshold turned
down pretty low, there wasn't a single false positive. So I'm going to try leaving it on overnight. It runs every three minutes, and so far has
gotten all the spam that's been posted in the last few hours. I'll probably tweak the sensitivity tomorrow, if all goes well. In the meantime, if
you want to test it, try starting a new thread with the words "mature" or "adult" or "galleries" or Cyrillic characters in the title and see how long
it stays up.
I'll add further tests for those cases, but in the meantime, I think most of you will notice a huge drop in the amount of
visible spam now. And with the moderators no longer having quite as many spam reports to deal with, we might actually be able to respond to U2U
messages, if this actually works.
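For anyone curious what a title check like the one described above might look like, here is a minimal sketch. It is not botkilla's actual code, which hasn't been posted. The word list is taken from the post; the function name `title_flags`, the flag wording, and the Cyrillic regex are assumptions made for illustration.

```python
import re

# Words named in the post above; botkilla's real word list is not public.
SPAM_WORDS = {"mature", "adult", "galleries"}

# Basic Cyrillic Unicode block (U+0400 to U+04FF).
CYRILLIC = re.compile(r"[\u0400-\u04FF]")


def title_flags(title):
    """Return the reasons a thread title looks spammy (hypothetical helper)."""
    flags = []
    # Split the lowercased title into plain ASCII words.
    words = set(re.findall(r"[a-z]+", title.lower()))
    if words & SPAM_WORDS:
        flags.append("spam words in title")
    if CYRILLIC.search(title):
        flags.append("cyrillic characters in title")
    return flags
```

Matching on whole words rather than substrings is a deliberate choice here: it avoids flagging a legitimate title that merely contains one of the words inside a longer one.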
The first step in the process of learning something is admitting that you don't know it already.
I'm givin' the spam shields max power at full warp, but they just dinna have the power! We're gonna have to evacuate to new forum software!
|
|
j_sum1
Administrator
Posts: 6320
Registered: 4-10-2014
Location: At home
Member Is Offline
Mood: Most of the ducks are in a row
|
|
Quote: Originally posted by Melgar | In the meantime, if you want to test it, try starting a new thread with the words "mature" or "adult" or "galleries" or Cyrillic characters in the
title and see how long it stays up. |
I did just that. here
It seems to be staying up -- it has been longer than 3 mins and there have been no replies in the thread. I will guess that the reason is that it is
not a new registration. (Does your botkilla test for that?) I might try again later with a dummy registration.
[edit]
Ok. I registered with the name jsumxgwsukspamtester and posted something with a spammy title. It lasted only a couple of minutes before vanishing.
The account is not deleted or banned, but the garbage is gone.
Well done, Melgar.
<clap><clap><clap><clap><clap><clap><clap><clap><clap><clap>
[Edited on 10-10-2018 by j_sum1]
|
|
streety
Hazard to Others
Posts: 110
Registered: 14-5-2018
Member Is Offline
|
|
Nice to see some automation. From your description and the behavior of your bot I assume you are not re-using any of the program I sent you.
I can also send you all the spam messages my script has collected over the past few months. Hopefully it will help with training your own
implementation. I'll wait for the current batch to be cleared.
[edit]Some of the spam posts have 80+ views. I don't think I've ever seen that before. Melgar, will that be your script?
[Edited on 10-10-2018 by streety]
|
|
streety
Hazard to Others
Posts: 110
Registered: 14-5-2018
Member Is Offline
|
|
The attached file contains a sqlite database file with three tables: topic, member and forum.
This is the definition for the tables:
Code: | from sqlalchemy import Column, Integer, Text, DateTime, ForeignKey
from sqlalchemy.orm import relationship
from sqlalchemy.ext.declarative import declarative_base

Base = declarative_base()


class Topic(Base):
    __tablename__ = 'topic'
    id = Column(Integer, primary_key=True)
    num_posts = Column(Integer)
    num_views = Column(Integer)
    first_post = Column(Text)
    topic_title = Column(Text)
    topic_id = Column(Integer, index=True)
    post_date = Column(DateTime)
    delete_date = Column(DateTime, nullable=True)
    forum_id = Column(Integer, ForeignKey('forum.id'))
    member_id = Column(Integer, ForeignKey('member.id'))

    def __init__(self, num_posts, num_views, first_post,
                 topic_title, topic_id, post_date, forum_id, member_id):
        self.num_posts = num_posts
        self.num_views = num_views
        self.first_post = first_post
        self.topic_title = topic_title
        self.topic_id = topic_id
        self.post_date = post_date
        self.forum_id = forum_id
        self.member_id = member_id

    def __repr__(self):
        return '<Topic %r>' % (self.topic_title)


class Member(Base):
    __tablename__ = 'member'
    id = Column(Integer, primary_key=True)
    name = Column(Text)
    register_date = Column(DateTime)
    deleted_date = Column(DateTime, nullable=True)
    topics = relationship('Topic', backref='member', lazy='dynamic')

    def __init__(self, name, register_date):
        self.name = name
        self.register_date = register_date

    def __repr__(self):
        return '<Member %r>' % (self.name)


class Forum(Base):
    __tablename__ = 'forum'
    id = Column(Integer, primary_key=True)
    forum_id = Column(Integer)
    name = Column(Text)
    topics = relationship('Topic', backref='forum', lazy='dynamic')

    def __init__(self, forum_id, name):
        self.forum_id = forum_id
        self.name = name

    def __repr__(self):
        return '<Forum %r>' % (self.name) |
If a post has been deleted the delete_date field will be set.
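As a quick illustration of using that field, the sketch below pulls out the deleted topics with the standard-library `sqlite3` module. The table and column names come from the definitions above; the helper name `deleted_topics` and the two rows in the in-memory stand-in database are made up for the example.

```python
import sqlite3


def deleted_topics(conn):
    """Return (topic_id, topic_title) for topics whose delete_date is set."""
    rows = conn.execute(
        "SELECT topic_id, topic_title FROM topic "
        "WHERE delete_date IS NOT NULL ORDER BY topic_id"
    )
    return rows.fetchall()


# Tiny in-memory stand-in for the attached database (made-up rows).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE topic (id INTEGER PRIMARY KEY, topic_id INTEGER, "
    "topic_title TEXT, delete_date DATETIME)"
)
conn.executemany(
    "INSERT INTO topic (topic_id, topic_title, delete_date) VALUES (?, ?, ?)",
    [
        (1, "Example chemistry thread", None),
        (2, "Example spam thread", "2018-10-10 12:00:00"),
    ],
)
print(deleted_topics(conn))
```

To run it against the real data, open a connection to the file extracted from the attachment instead of the in-memory database.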
Attachment: db.sqlite.tar.gz (4.8MB) This file has been downloaded 746 times
|
|
Melgar
Anti-Spam Agent
Posts: 2004
Registered: 23-2-2010
Location: Connecticut
Member Is Offline
Mood: Estrified
|
|
I'm posting a more detailed description of what botkilla looks for in Whimsy, but I'll mention here that the script does keep logs of every
post it deletes. Here's the one j_sum1 seems to have made:
Code: | title: mature spam poke out
username: jsumxgwsukspamtester
replies: 0
last_poster: jsumxgwsukspamtester
tid: 96794
spam_score: 12
flags:
- spam words in title
- registered since yesterday
thread_text: "Just a quick test of the new botkilla<br />\r\nj_sum1" |
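The log entry suggests that each rule that fires leaves a flag and adds to a total spam_score. A rough sketch of that pattern follows. The weights and the threshold are invented for illustration and are not botkilla's real values, and the names `WEIGHTS`, `THRESHOLD`, `score_thread` and `should_delete` are hypothetical.

```python
# Hypothetical weight per flag; the real script's numbers are unknown.
WEIGHTS = {
    "spam words in title": 3,
    "registered since yesterday": 2,
}

# Hypothetical cut-off: a thread is deleted at or above this score.
THRESHOLD = 4


def score_thread(flags):
    """Sum the weights of the raised flags; unknown flags count as 0."""
    return sum(WEIGHTS.get(flag, 0) for flag in flags)


def should_delete(flags):
    """True when the combined score reaches the deletion threshold."""
    return score_thread(flags) >= THRESHOLD
```

With a cut-off above any single weight, no single flag is enough to delete a thread. That would be consistent with j_sum1's test, where the spammy title from an established account stayed up but the same kind of title from a fresh registration was removed.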
Here's the link to a longer description of what it does, for whoever can access Whimsy:
http://www.sciencemadness.org/talk/viewthread.php?tid=96884
I think I'm going to have to implement all the rules manually, just because I want to lower the chances of false positives as much as
possible.
[Edited on 10/10/18 by Melgar]
|
|
JJay
International Hazard
Posts: 3440
Registered: 15-10-2015
Member Is Offline
|
|
Modern antispam software has very little trouble with false positives. Quit making excuses; if you're incompetent, turn in your resignation.
|
|
Melgar
Anti-Spam Agent
Posts: 2004
Registered: 23-2-2010
Location: Connecticut
Member Is Offline
Mood: Estrified
|
|
Yes, I took the script down for like a half hour while I worked on it, and that's what's accumulated during that time period. Look at the times on
all of those posts.
|
|
fusso
International Hazard
Posts: 1922
Registered: 23-6-2017
Location: 4 ∥ universes ahead of you
Member Is Offline
|
|
0 posts?
|
|
j_sum1
Administrator
Posts: 6320
Registered: 4-10-2014
Location: At home
Member Is Offline
Mood: Most of the ducks are in a row
|
|
That's how much post count really means.
|
|
Tsjerk
International Hazard
Posts: 3032
Registered: 20-4-2005
Location: Netherlands
Member Is Offline
Mood: Mood
|
|
@fusso; what do you mean with that post? It is a guy who registered in 2011 and he didn't post anything. So what?
@Melgar; nice work! No more spam for at least a couple of days!
[Edited on 14-10-2018 by Tsjerk]
|
|
fusso
International Hazard
Posts: 1922
Registered: 23-6-2017
Location: 4 ∥ universes ahead of you
Member Is Offline
|
|
Quote: Originally posted by Tsjerk | @fusso; what do you mean with that post? It is a guy who registered in 2011 and he didn't post anything. So what? | I said something wrong?!:O
|
|
Tsjerk
International Hazard
Posts: 3032
Registered: 20-4-2005
Location: Netherlands
Member Is Offline
Mood: Mood
|
|
No, nothing wrong, I just don't understand what you mean with that screenshot.
Ah ok, it is a post but his count is on zero, well, I wouldn't worry too much about it.
[Edited on 14-10-2018 by Tsjerk]
|
|
fusso
International Hazard
Posts: 1922
Registered: 23-6-2017
Location: 4 ∥ universes ahead of you
Member Is Offline
|
|
Quote: Originally posted by Tsjerk | No, nothing wrong, I just don't understand what you mean with that screenshot.
Ah ok, it is a post but his count is on zero, well, I wouldn't worry too much about it.
[Edited on 14-10-2018 by Tsjerk] | My date format is yy/mm/dd, so that bot is registered in 2018, not 2011
The screenshot shows the bot's spam post but its post count is 0, and these 2 observations contradict each other.
[Edited on 18/10/14 by fusso]
|
|
Tsjerk
International Hazard
Posts: 3032
Registered: 20-4-2005
Location: Netherlands
Member Is Offline
Mood: Mood
|
|
Ah, then I understand why I didn't get it
Edit: All spam is disappearing in minutes!
[Edited on 15-10-2018 by Tsjerk]
|
|
Melgar
Anti-Spam Agent
Posts: 2004
Registered: 23-2-2010
Location: Connecticut
Member Is Offline
Mood: Estrified
|
|
Nice! Glad it's working.
It probably catches a new spam post every 5-10 minutes. There was a bug in my script that would make it stop running if several unlikely things
happened at once. It's fixed now though, so that shouldn't happen anymore.
I don't know why that user has no posts, but I recognize that user as a spammer, since I've been periodically checking the logs. And I'm not going to
worry about a spammer's post count.
|
|
phlogiston
International Hazard
Posts: 1379
Registered: 26-4-2008
Location: Neon Thorium Erbium Lanthanum Neodymium Sulphur
Member Is Offline
Mood: pyrophoric
|
|
Melgar, THANK YOU! It is such a joy to find nearly only non-spam in 'today's post' and browse for interesting contributions again.
-----
"If a rocket goes up, who cares where it comes down, that's not my concern said Wernher von Braun" - Tom Lehrer
|
|
fusso
International Hazard
Posts: 1922
Registered: 23-6-2017
Location: 4 ∥ universes ahead of you
Member Is Offline
|
|
@Melgar
Can your program detect spam posts in other threads?
|
|
j_sum1
Administrator
Posts: 6320
Registered: 4-10-2014
Location: At home
Member Is Offline
Mood: Most of the ducks are in a row
|
|
No.
Report those. Or, preferably, send a mod a u2u. These will need a manual deletion. The volume of "reported post" u2us has gone way down but I
still don't read all of them if the board is looking clear of spam. (I presume the other mods are the same.) If you want to attract our attention
then a specific message will work better.
|
|
Melgar
Anti-Spam Agent
Posts: 2004
Registered: 23-2-2010
Location: Connecticut
Member Is Offline
Mood: Estrified
|
|
It could, but doesn't. The main reason is that I haven't seen enough of those types of posts to come up with a good method of identifying them. A
spam post would have to stay up long enough for me to write the code to detect it and then test that code on it to make sure that everything works
properly. By the time that all could take place, the spam is typically gone, thanks to the quick action of a mod.
I've noticed that spam reports are about 10x as frequent when the script has been temporarily taken down, so I'm assuming that the occasional spam
that appends itself to other threads is a pretty minor problem. I could come up with rules to use to eliminate it easily enough, but a few of those
posts would need to stay up long enough to make sure those rules work to identify them. I'll leave it up to others to determine whether it's worth a
coordinated effort to address.
|
|
Mr. Rogers
Hazard to Others
Posts: 184
Registered: 30-10-2017
Location: Ammonia Avenue
Member Is Offline
Mood: No Mood
|
|
Does the sign-up system use CAPTCHAs? I don't recall. These posts don't look like real people. It's like generic Viagra SPAM.
[Edited on 21-10-2018 by Mr. Rogers]
|
|
fusso
International Hazard
Posts: 1922
Registered: 23-6-2017
Location: 4 ∥ universes ahead of you
Member Is Offline
|
|
Quote: Originally posted by Mr. Rogers | Does the sign-up system use CAPTCHAs? I don't recall. These posts don't look like real people. It's like generic Viagra SPAM.
[Edited on 21-10-2018 by Mr. Rogers] | No, no antispam features in registration.
[Edited on 181021 by fusso]
|
|
fusso
International Hazard
Posts: 1922
Registered: 23-6-2017
Location: 4 ∥ universes ahead of you
Member Is Offline
|
|
SPAMMERS HAVE EVOLVED
SPAMMERS HAVE EVOLVED TO PRETEND TO BE CHEMISTS
|
|
Melgar
Anti-Spam Agent
Posts: 2004
Registered: 23-2-2010
Location: Connecticut
Member Is Offline
Mood: Estrified
|
|
My guess is it's just copying text from some other thread. BotKilla would get that one anyway, because it's linking to an unrecognized domain, and it
registered within the last 48 hours.
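The two checks mentioned here could be sketched roughly as below. This is an illustration only, not BotKilla's code: the allowlist `KNOWN_DOMAINS`, the URL regex, and the helper names `unrecognized_domains` and `registered_recently` are all assumptions, since the thread doesn't say which domains the script recognizes.

```python
import re
from datetime import datetime, timedelta
from urllib.parse import urlparse

# Hypothetical allowlist; the domains BotKilla recognizes are not listed here.
KNOWN_DOMAINS = {"sciencemadness.org", "wikipedia.org"}

# Simple pattern for http(s) links in post text.
URL_RE = re.compile(r"https?://[^\s\"'<>]+")


def unrecognized_domains(text):
    """Return the sorted hostnames linked in text that are not allowlisted."""
    unknown = set()
    for url in URL_RE.findall(text):
        host = (urlparse(url).hostname or "").lower()
        # Accept an allowlisted domain itself and any subdomain of it.
        if not any(host == d or host.endswith("." + d) for d in KNOWN_DOMAINS):
            unknown.add(host)
    return sorted(unknown)


def registered_recently(register_date, now, hours=48):
    """True if the account is younger than `hours` hours at time `now`."""
    return now - register_date < timedelta(hours=hours)
```

A post that only copies text from another thread would still trip both checks as long as it carries a link to a domain outside the allowlist and comes from a fresh account.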
|
|
fusso
International Hazard
Posts: 1922
Registered: 23-6-2017
Location: 4 ∥ universes ahead of you
Member Is Offline
|
|
Quote: Originally posted by Melgar |
My guess is it's just copying text from some other thread. BotKilla would get that one anyway, because it's linking to an unrecognized domain, and it
registered within the last 48 hours. | Even if it's only copying, it does seem to know what it should copy.
|
|