MediaWiki talk:Spam-blacklist
The list of regex fragments on this page is used to check URLs. It doesn't match on words outside of links, and it uses a specialized format for some matching functions; see here: [https://www.mediawiki.org/wiki/Extension:SpamBlacklist MediaWiki Extension:SpamBlacklist].
The interesting aspect of this page is that we can opt to blacklist EVERYTHING using a regex fragment that matches all, and then make use of [[MediaWiki:Spam-whitelist]] to specifically allow domains we know are friendly. The latter method might end up being the better, easier-to-manage option for us. --[[User:TheFae|TheFae]] ([[User talk:TheFae|talk]]) 20:04, 24 July 2017 (UTC)
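
For illustration, the blacklist-everything approach might look something like the fragment below (an untested sketch; the exact fragment semantics and comment syntax are described on the extension page linked above):

<pre>
# MediaWiki:Spam-blacklist
# Each non-comment line is a regex fragment matched against URLs added in edits.
# A catch-all fragment blocks every external link by default:
.*
</pre>

With a catch-all in place, every external link would be rejected unless [[MediaWiki:Spam-whitelist]] explicitly allows its domain.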
: Let's see how we go with a blacklist first for 2 weeks... if we get tired, we'll go with a whitelist.
: This page itself won't confuse a search engine though, will it? They'll know we're not trying to game SEO with these words? --[[User:Vadi|Vadi]] ([[User talk:Vadi|talk]]) 08:43, 25 July 2017 (UTC)
:: Most of the content on this page is nonsense and very few words appear in links both to and from the page.  Crawlers are likely to rank this page as having low importance because it lacks a rich link and content structure. 
:: One thing we can do is add the URI of this page to a robots.txt for the wiki domain and instruct crawlers not to index it. This might not prevent them from scanning the file, but it would inform legitimate crawlers that the contents aren't intended for indexing and that we aren't trying to keyword- or backlink-spam with all these domain-like bits on the page. --[[User:TheFae|TheFae]] ([[User talk:TheFae|talk]]) 16:56, 25 July 2017 (UTC)
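
As a rough sketch, the robots.txt addition might look like this (the paths are assumptions; they would depend on how this wiki's URLs are actually laid out):

<pre>
# robots.txt for the wiki domain
# Ask crawlers not to index the blacklist page or its talk page.
User-agent: *
Disallow: /index.php?title=MediaWiki:Spam-blacklist
Disallow: /index.php?title=MediaWiki_talk:Spam-blacklist
</pre>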
Can we keep discussion in GitHub issues, where we discuss everything else? I find it hard to review discussions in multiple places.
Also, I find a whitelist rather troublesome, as we may hinder legitimate users from linking domains which we just did not recognise before. --[[User:Kebap|Kebap]] ([[User talk:Kebap|talk]]) 17:33, 30 July 2017 (UTC)
:The Forums are where everything can be discussed; the GitHub issue tracker should remain focused on issues with Mudlet as an application and its source code. There is absolutely no difficulty in finding this page; in fact, it would be MORE difficult to find discussion related to this special page if you had to go look for it on an entirely different site. Not to mention that the surplus of "Issues" attached to Mudlet's source code makes it look worse than it actually is, imho.
:I do agree that the Forums or anything else would provide a better format for organized discussion than wiki markup...
:My intention is only to share information about the tools with the others who are going to be responsible for maintaining this wiki in an administrative capacity. All I'm going to say is that a whitelist would ultimately be more secure and effective. It requires would-be editors to have some kind of discussion about the links/domains they want to add if the community admins haven't allowed the domain(s) already. I think it's better to talk about what and why before adding it, rather than just letting people add whatever links they want and having to chase around spammers. Besides, how many domains does the Mudlet Manual really need to link back to anyway? --[[User:TheFae|TheFae]] ([[User talk:TheFae|talk]]) 21:43, 30 July 2017 (UTC)
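
As a companion to the catch-all sketch above, the whitelist could then name the domains we already trust (hypothetical entries, listed purely for illustration; the real list would be whatever the admins agree on):

<pre>
# MediaWiki:Spam-whitelist
# Regex fragments for domains exempted from the blacklist.
mudlet\.org
forums\.mudlet\.org
github\.com/Mudlet
</pre>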