Sciencemadness Discussion Board
Author: Subject: Tired of reporting spam
Tsjerk
International Hazard
*****




Posts: 3032
Registered: 20-4-2005
Location: Netherlands
Member Is Offline

Mood: Mood

[*] posted on 9-10-2018 at 20:11


Quote: Originally posted by Magpie  
Would it be possible to have the forum software automatically delete:

1. posts in Russian?


Cyrillic, you mean?
Melgar
Anti-Spam Agent
*****




Posts: 2004
Registered: 23-2-2010
Location: Connecticut
Member Is Offline

Mood: Estrified

[*] posted on 9-10-2018 at 21:40


Ok, my proprietary spam-destroying system "botkilla" has been activated. There was just an insane amount of spam, and even with the threshold turned down pretty low, there wasn't a single false positive. So I'm going to try leaving it on overnight. It runs every three minutes, and so far has gotten all the spam that's been posted in the last few hours. I'll probably tweak the sensitivity tomorrow, if all goes well. In the meantime, if you want to test it, try starting a new thread with the words "mature" or "adult" or "galleries" or Cyrillic characters in the title and see how long it stays up.

I'll make it so that there are additional tests for those cases, but in the meantime, I think most of you will notice a huge drop in the amount of visible spam now. And with the moderators no longer having quite as many spam reports to deal with, we might actually be able to respond to U2U messages, if this actually works.
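
For anyone curious what a title test along those lines could look like, here is a rough sketch. It is not botkilla's actual code, which hasn't been posted. The three words are the ones named in this post; the function name `title_flags`, the flag wording for Cyrillic, and the choice to match whole words only are assumptions made for illustration.

```python
import re

# Words named in the post; the list botkilla really uses is not public.
SPAM_WORDS = {"mature", "adult", "galleries"}

# Basic Cyrillic Unicode block (U+0400 to U+04FF).
CYRILLIC = re.compile(r"[\u0400-\u04FF]")


def title_flags(title):
    """Return a list of reasons a thread title looks spammy (empty if none)."""
    flags = []
    # Compare whole lowercase words, not substrings.
    words = set(re.findall(r"[a-z]+", title.lower()))
    if words & SPAM_WORDS:
        flags.append("spam words in title")
    if CYRILLIC.search(title):
        flags.append("cyrillic in title")
    return flags


print(title_flags("mature spam poke out"))
print(title_flags("Detecting adulterated reagents"))
```

Matching whole words instead of substrings is a deliberate choice in this sketch: on a chemistry board, a title containing "adulterated" should not trip the "adult" rule.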




The first step in the process of learning something is admitting that you don't know it already.

I'm givin' the spam shields max power at full warp, but they just dinna have the power! We're gonna have to evacuate to new forum software!
j_sum1
Administrator
********




Posts: 6333
Registered: 4-10-2014
Location: At home
Member Is Offline

Mood: Most of the ducks are in a row

[*] posted on 9-10-2018 at 22:36


Quote: Originally posted by Melgar  
In the meantime, if you want to test it, try starting a new thread with the words "mature" or "adult" or "galleries" or Cyrillic characters in the title and see how long it stays up.

I did just that. here

It seems to be staying up -- it has been longer than 3 mins and there have been no replies in the thread. I will guess that the reason is that it is not a new registration. (Does your botkilla test for that?) I might try again later with a dummy registration.


[edit]
Ok. I registered with the name, jsumxgwsukspamtester and posted something with a spammy title. It lasted only a couple of minutes before vanishing.
Account is not deleted or banned but the garbage is gone.

Well done, Melgar.
<clap><clap><clap><clap><clap><clap><clap><clap><clap><clap>

[Edited on 10-10-2018 by j_sum1]
streety
Hazard to Others
***




Posts: 110
Registered: 14-5-2018
Member Is Offline


[*] posted on 10-10-2018 at 04:15


Nice to see some automation. From your description and the behavior of your bot I assume you are not re-using any of the program I sent you.

I can also send you all the spam messages my script has collected over the past few months. Hopefully it will help with training your own implementation. I'll wait for the current batch to be cleared.

[edit] Some of the spam posts have 80+ views. I don't think I've ever seen that before. Melgar, would that be your script?

[Edited on 10-10-2018 by streety]
streety
Hazard to Others
***




Posts: 110
Registered: 14-5-2018
Member Is Offline


[*] posted on 10-10-2018 at 04:52


The attached file contains a sqlite database file with three tables: topic, member and forum.

This is the definition for the tables:

Code:
from sqlalchemy import Column, Integer, Text, DateTime, ForeignKey
from sqlalchemy.orm import relationship
from sqlalchemy.ext.declarative import declarative_base

Base = declarative_base()


class Topic(Base):
    __tablename__ = 'topic'

    id = Column(Integer, primary_key=True)
    num_posts = Column(Integer)
    num_views = Column(Integer)
    first_post = Column(Text)
    topic_title = Column(Text)
    topic_id = Column(Integer, index=True)
    post_date = Column(DateTime)
    delete_date = Column(DateTime, nullable=True)
    forum_id = Column(Integer, ForeignKey('forum.id'))
    member_id = Column(Integer, ForeignKey('member.id'))

    def __init__(self, num_posts, num_views, first_post, topic_title,
                 topic_id, post_date, forum_id, member_id):
        self.num_posts = num_posts
        self.num_views = num_views
        self.first_post = first_post
        self.topic_title = topic_title
        self.topic_id = topic_id
        self.post_date = post_date
        self.forum_id = forum_id
        self.member_id = member_id

    def __repr__(self):
        return '<Topic %r>' % (self.topic_title)


class Member(Base):
    __tablename__ = 'member'

    id = Column(Integer, primary_key=True)
    name = Column(Text)
    register_date = Column(DateTime)
    deleted_date = Column(DateTime, nullable=True)
    topics = relationship('Topic', backref='member', lazy='dynamic')

    def __init__(self, name, register_date):
        self.name = name
        self.register_date = register_date

    def __repr__(self):
        return '<Member %r>' % (self.name)


class Forum(Base):
    __tablename__ = 'forum'

    id = Column(Integer, primary_key=True)
    forum_id = Column(Integer)
    name = Column(Text)
    topics = relationship('Topic', backref='forum', lazy='dynamic')

    def __init__(self, forum_id, name):
        self.forum_id = forum_id
        self.name = name

    def __repr__(self):
        return '<Forum %r>' % (self.name)


If a post has been deleted the delete_date field will be set.
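
As a quick illustration of how that field can be used, the sketch below pulls every deleted topic out of the `topic` table with plain `sqlite3`. It runs against a small in-memory stand-in so it works without the attachment; the sample rows are invented, and the file name `db.sqlite` mentioned in the comment is an assumption about what the archive unpacks to.

```python
import sqlite3


def deleted_topics(conn):
    """Return (topic_title, delete_date) for every topic marked as deleted."""
    cur = conn.execute(
        "SELECT topic_title, delete_date FROM topic "
        "WHERE delete_date IS NOT NULL ORDER BY delete_date"
    )
    return cur.fetchall()


# Stand-in data so the sketch is runnable on its own. With the real file
# you would call sqlite3.connect("db.sqlite") instead (file name assumed).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE topic (id INTEGER PRIMARY KEY, topic_title TEXT, "
    "post_date DATETIME, delete_date DATETIME)"
)
conn.executemany(
    "INSERT INTO topic (topic_title, post_date, delete_date) VALUES (?, ?, ?)",
    [
        ("mature galleries", "2018-10-09 20:00:00", "2018-10-09 20:03:00"),
        ("Distilling ethanol", "2018-10-09 21:00:00", None),
    ],
)

print(deleted_topics(conn))
```

Rows with a non-NULL `delete_date` can serve as the spam examples and the rest as the non-spam examples when building a training set.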

Attachment: db.sqlite.tar.gz (4.8MB)
This file has been downloaded 755 times

Melgar
Anti-Spam Agent
*****




Posts: 2004
Registered: 23-2-2010
Location: Connecticut
Member Is Offline

Mood: Estrified

[*] posted on 10-10-2018 at 07:02


I'm posting a more detailed description of what botkilla looks for in Whimsy, but I'll mention here that the script does keep logs of every post it deletes. Here's the one j_sum1 seems to have made:

Code:
title: mature spam poke out
username: jsumxgwsukspamtester
replies: 0
last_poster: jsumxgwsukspamtester
tid: 96794
spam_score: 12
flags:
- spam words in title
- registered since yesterday
thread_text: "Just a quick test of the new botkilla<br />\r\nj_sum1"


Here's the link to a longer description of what it does, for whoever can access Whimsy:

http://www.sciencemadness.org/talk/viewthread.php?tid=96884

I think I'm going to have to implement all the rules manually, just because I want to lower the chances of false positives as much as possible.
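
To give a feel for what hand-written rules with a score and a list of flags can look like, here is a minimal sketch. It only borrows the two flag names from the log entry; the point values, the threshold, the 24-hour reading of "registered since yesterday", and the names `Thread`, `RULES`, `score` and `is_spam` are all invented for illustration and do not reproduce the score in the log.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Thread:
    title: str
    registered: datetime  # when the poster's account was created
    posted: datetime      # when the thread was started


SPAM_WORDS = ("mature", "adult", "galleries")

# Hypothetical rules as (flag text, points, test). All numbers are invented.
RULES = [
    ("spam words in title", 5,
     lambda t: any(w in t.title.lower().split() for w in SPAM_WORDS)),
    ("registered since yesterday", 5,
     lambda t: t.posted - t.registered < timedelta(days=1)),
]

THRESHOLD = 8  # invented; a thread is treated as spam at or above this score


def score(thread):
    """Sum the points of every rule that fires and collect the flag names."""
    total, flags = 0, []
    for name, points, test in RULES:
        if test(thread):
            total += points
            flags.append(name)
    return total, flags


def is_spam(thread):
    return score(thread)[0] >= THRESHOLD


new_account = Thread("mature spam poke out",
                     datetime(2018, 10, 10, 1, 0), datetime(2018, 10, 10, 2, 0))
old_account = Thread("mature cheese",
                     datetime(2014, 10, 4), datetime(2018, 10, 9))
print(score(new_account), is_spam(new_account))
print(score(old_account), is_spam(old_account))
```

With rules like these no single flag is enough on its own, which is one way to keep false positives down: an established member using a flagged word in a title stays under the threshold.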

[Edited on 10/10/18 by Melgar]




JJay
International Hazard
*****




Posts: 3440
Registered: 15-10-2015
Member Is Offline


[*] posted on 10-10-2018 at 07:13


Modern antispam software has very little trouble with false positives. Quit making excuses; if you're incompetent, turn in your resignation.

Untitled.png - 266kB




Melgar
Anti-Spam Agent
*****




Posts: 2004
Registered: 23-2-2010
Location: Connecticut
Member Is Offline

Mood: Estrified

[*] posted on 10-10-2018 at 07:19


Yes, I took the script down for like a half hour while I worked on it, and that's what's accumulated during that time period. Look at the times on all of those posts.



fusso
International Hazard
*****




Posts: 1922
Registered: 23-6-2017
Location: 4 ∥ universes ahead of you
Member Is Offline


[*] posted on 13-10-2018 at 15:29


0 posts?

Sciencemadness Discussion Board - Rudi Davie chose Royalcolleges Team - Powered .png - 15kB




j_sum1
Administrator
********




Posts: 6333
Registered: 4-10-2014
Location: At home
Member Is Offline

Mood: Most of the ducks are in a row

[*] posted on 13-10-2018 at 16:31


That's how much post count really means.
Tsjerk
International Hazard
*****




Posts: 3032
Registered: 20-4-2005
Location: Netherlands
Member Is Offline

Mood: Mood

[*] posted on 14-10-2018 at 09:00


@fusso; what do you mean with that post? It is a guy who registered in 2011 and he didn't post anything. So what?

@Melgar; nice work! No more spam for at least a couple of days!

[Edited on 14-10-2018 by Tsjerk]
fusso
International Hazard
*****




Posts: 1922
Registered: 23-6-2017
Location: 4 ∥ universes ahead of you
Member Is Offline


[*] posted on 14-10-2018 at 09:19


Quote: Originally posted by Tsjerk  
@fusso; what do you mean with that post? It is a guy who registered in 2011 and he didn't post anything. So what?
I said something wrong?!:O



Tsjerk
International Hazard
*****




Posts: 3032
Registered: 20-4-2005
Location: Netherlands
Member Is Offline

Mood: Mood

[*] posted on 14-10-2018 at 09:54


No, nothing wrong, I just don't understand what you mean with that screenshot.

Ah ok, it is a post but his count is on zero, well, I wouldn't worry too much about it.

[Edited on 14-10-2018 by Tsjerk]
fusso
International Hazard
*****




Posts: 1922
Registered: 23-6-2017
Location: 4 ∥ universes ahead of you
Member Is Offline


[*] posted on 14-10-2018 at 10:22


Quote: Originally posted by Tsjerk  
No, nothing wrong, I just don't understand what you mean with that screenshot.

Ah ok, it is a post but his count is on zero, well, I wouldn't worry too much about it.

[Edited on 14-10-2018 by Tsjerk]
My date format is yy/mm/dd, so that bot is registered in 2018, not 2011:P

The screenshot shows the bot's spam post but its post count is 0, and these 2 observations contradict each other.

[Edited on 18/10/14 by fusso]




Tsjerk
International Hazard
*****




Posts: 3032
Registered: 20-4-2005
Location: Netherlands
Member Is Offline

Mood: Mood

[*] posted on 14-10-2018 at 18:57


Ah, then I understand why I didn't get it

Edit: All spam is disappearing in minutes!

[Edited on 15-10-2018 by Tsjerk]
Melgar
Anti-Spam Agent
*****




Posts: 2004
Registered: 23-2-2010
Location: Connecticut
Member Is Offline

Mood: Estrified

[*] posted on 16-10-2018 at 08:37


Nice! Glad it's working.

It probably catches a new spam post every 5-10 minutes. There was a bug in my script that would make it stop running if several unlikely things happened at once. It's fixed now though, so that shouldn't happen anymore.

I don't know why that user has no posts, but I recognize that user as a spammer, since I've been periodically checking the logs. And I'm not going to worry about a spammer's post count. :P




phlogiston
International Hazard
*****




Posts: 1379
Registered: 26-4-2008
Location: Neon Thorium Erbium Lanthanum Neodymium Sulphur
Member Is Offline

Mood: pyrophoric

[*] posted on 16-10-2018 at 14:40


Melgar, THANK YOU! It is such a joy to find nearly only non-spam in 'today's post' and browse for interesting contributions again.



-----
"If a rocket goes up, who cares where it comes down, that's not my concern said Wernher von Braun" - Tom Lehrer
fusso
International Hazard
*****




Posts: 1922
Registered: 23-6-2017
Location: 4 ∥ universes ahead of you
Member Is Offline


[*] posted on 18-10-2018 at 16:05


@Melgar
Can your program detect spam posts in other threads?




j_sum1
Administrator
********




Posts: 6333
Registered: 4-10-2014
Location: At home
Member Is Offline

Mood: Most of the ducks are in a row

[*] posted on 18-10-2018 at 20:06


Quote: Originally posted by fusso  
@Melgar
Can your program detect spam posts in other threads?

No.
Report those. Or, preferably, send a mod a u2u. These will need a manual deletion. The volume of "reported post" u2us has gone way down but I still don't read all of them if the board is looking clear of spam. (I presume the other mods are the same.) If you want to attract our attention then a specific message will work better.
Melgar
Anti-Spam Agent
*****




Posts: 2004
Registered: 23-2-2010
Location: Connecticut
Member Is Offline

Mood: Estrified

[*] posted on 18-10-2018 at 20:26


Quote: Originally posted by fusso  
@Melgar
Can your program detect spam posts in other threads?

It could, but doesn't. The main reason is that I haven't seen enough of those types of posts to come up with a good method of identifying them. A spam post would have to stay up long enough for me to write the code to detect it and then test that code on it to make sure that everything works properly. By the time that all could take place, the spam is typically gone, thanks to the quick action of a mod.

I've noticed that spam reports are about 10x as frequent when the script has been temporarily taken down, so I'm assuming that the occasional spam that appends itself to other threads is a pretty minor problem. I could come up with rules to use to eliminate it easily enough, but a few of those posts would need to stay up long enough to make sure those rules work to identify them. I'll leave it up to others to determine whether it's worth a coordinated effort to address.




Mr. Rogers
Hazard to Others
***




Posts: 184
Registered: 30-10-2017
Location: Ammonia Avenue
Member Is Offline

Mood: No Mood

[*] posted on 20-10-2018 at 20:27


Does the sign-up system use CAPTCHAs? I don't recall. These posts don't look like real people. It's like generic Viagra SPAM.

[Edited on 21-10-2018 by Mr. Rogers]
fusso
International Hazard
*****




Posts: 1922
Registered: 23-6-2017
Location: 4 ∥ universes ahead of you
Member Is Offline


[*] posted on 21-10-2018 at 09:01


Quote: Originally posted by Mr. Rogers  
Does the sign-up system use CAPTCHAs? I don't recall. These posts don't look like real people. It's like generic Viagra SPAM.

[Edited on 21-10-2018 by Mr. Rogers]
No, no antispam features in registration.

[Edited on 181021 by fusso]




fusso
International Hazard
*****




Posts: 1922
Registered: 23-6-2017
Location: 4 ∥ universes ahead of you
Member Is Offline


[*] posted on 22-10-2018 at 09:45
SPAMMERS HAVE EVOLVED


SPAMMERS HAVE EVOLVED TO PRETEND TO BE CHEMISTS:mad::mad::mad:

Sciencemadness Discussion Board - Powered by XMB 1.9.11.png - 9kB
Sciencemadness Discussion Board - Do electronic balances round the mass off or d.png - 15kB




Melgar
Anti-Spam Agent
*****




Posts: 2004
Registered: 23-2-2010
Location: Connecticut
Member Is Offline

Mood: Estrified

[*] posted on 24-10-2018 at 10:08


Quote: Originally posted by fusso  
SPAMMERS HAVE EVOLVED TO PRETEND TO BE CHEMISTS:mad::mad::mad:

My guess is it's just copying text from some other thread. BotKilla would get that one anyway, because it's linking to an unrecognized domain, and it registered within the last 48 hours.
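
The two conditions mentioned here (a link to an unrecognized domain plus a registration less than 48 hours old) could be checked roughly as below. This is a sketch under assumptions, not BotKilla's code: the allowlist contents, the URL regex, and the function names are made up for the example.

```python
import re
from datetime import datetime, timedelta
from urllib.parse import urlparse

# Hypothetical allowlist; the domains BotKilla actually recognizes are unknown.
KNOWN_DOMAINS = {"sciencemadness.org", "wikipedia.org"}

URL = re.compile(r"https?://[^\s<>]+")


def unrecognized_domains(text):
    """Return the set of linked hostnames that are not on the allowlist."""
    hosts = set()
    for url in URL.findall(text):
        try:
            host = (urlparse(url).hostname or "").lower().rstrip(".")
        except ValueError:
            continue  # malformed URL, nothing usable to check
        known = any(host == d or host.endswith("." + d) for d in KNOWN_DOMAINS)
        if host and not known:
            hosts.add(host)
    return hosts


def new_account_with_unknown_link(text, registered, now):
    """True if the poster registered under 48 hours ago and links off-list."""
    is_new = now - registered < timedelta(hours=48)
    return is_new and bool(unrecognized_domains(text))


now = datetime(2018, 10, 24, 10, 0)
post = "Do electronic balances round the mass off? http://pills.example/x"
print(new_account_with_unknown_link(post, datetime(2018, 10, 23), now))
print(new_account_with_unknown_link(post, datetime(2015, 1, 1), now))
```

Since the check looks at the link and the account age, not at the wording, copied chemistry text doesn't help the spammer get past it.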




fusso
International Hazard
*****




Posts: 1922
Registered: 23-6-2017
Location: 4 ∥ universes ahead of you
Member Is Offline


[*] posted on 24-10-2018 at 10:32


Quote: Originally posted by Melgar  
Quote: Originally posted by fusso  
SPAMMERS HAVE EVOLVED TO PRETEND TO BE CHEMISTS:mad::mad::mad:

My guess is it's just copying text from some other thread. BotKilla would get that one anyway, because it's linking to an unrecognized domain, and it registered within the last 48 hours.
Even if it's only copying, it does seem to know what it should copy.


