Protecting our original content from site scrapers is a constant challenge. It’s even harder now that full RSS feeds make blogs easier to scrape.
The usual prevention method among bloggers is switching from full feeds to partial feeds. I never really worried about it, even though I publish full feeds. Lately, however, I’ve noticed that the scraper sites (splogs) sometimes rank higher than mine, which is alarming.
Search engines promise publishers that their systems can intelligently tell the original from the dupes, but I don’t think their success rate is any good. So I figured that getting a backlink from the splogs might help solve the dupe issue.
I’ve been using the Feed Footer plugin, which adds custom footers (copyright, notices, advertisements) to the bottom of blog posts in the RSS feed. I’m sure most of you have seen them already.
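To show the idea behind a feed footer, here is a minimal Python sketch. It is not the Feed Footer plugin's code (the plugin runs inside WordPress); the function name `add_feed_footer` and its parameters are made up for illustration. It appends an attribution line with a link back to the original post, so a scraper that republishes the feed item also republishes the backlink.

```python
from html import escape


def add_feed_footer(item_html: str, post_url: str, site_name: str) -> str:
    """Return the feed item body with an attribution footer appended.

    Hypothetical helper for illustration only; not part of any plugin.
    """
    # Escape the URL and site name so they are safe inside HTML.
    footer = (
        "<p>This post first appeared on "
        f'<a href="{escape(post_url, quote=True)}">{escape(site_name)}</a>.</p>'
    )
    return item_html + "\n" + footer


if __name__ == "__main__":
    print(add_feed_footer("<p>Hello</p>", "https://example.com/post", "My Blog"))
```

A scraper can strip the footer, of course, but most splogs republish feed items untouched.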
However, if that’s not enough, you can try the AntiLeech plugin:
AntiLeech produces a fake set of content especially for them that includes links back to your site and sends it only to them. When they steal this content, it appears online just like normal, except now you’ve turned the tables on them and have provided them with useless content.
AntiLeech can detect a splogger bot using its User-Agent string (an identifier that some bots send when they are collecting data), or by IP address. You can enter a User-Agent or an IP address into the Options panel of your WordPress blog. When a visitor with a qualifying (any checked option on the options page) User-Agent or IP address visits your site, they will see only the generated content. They will see it in your page layout and in your feeds. Anywhere you’re normally outputting content, that’s where the fake content will appear to them.
Regular users whose browsers do not match these strings will see your normal content. RSS aggregators should be able to display your content normally, too.
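The matching described above can be sketched roughly as follows. This is not AntiLeech's actual code, only an illustration in Python under stated assumptions: the blocklist entries, the substring-based User-Agent match, and the names `is_flagged` and `choose_content` are all hypothetical.

```python
from ipaddress import ip_address

# Hypothetical blocklists; in the plugin these would come from the Options panel.
BLOCKED_AGENT_SUBSTRINGS = ["examplescraperbot"]
BLOCKED_IPS = {ip_address("203.0.113.7")}  # address from a documentation-only range


def is_flagged(user_agent: str, remote_ip: str) -> bool:
    """Return True if the visitor matches a blocked User-Agent or IP address."""
    agent = user_agent.lower()
    if any(fragment in agent for fragment in BLOCKED_AGENT_SUBSTRINGS):
        return True
    try:
        return ip_address(remote_ip) in BLOCKED_IPS
    except ValueError:
        # An unparseable address is treated as not flagged.
        return False


def choose_content(user_agent: str, remote_ip: str, real: str, decoy: str) -> str:
    """Serve the decoy content to flagged visitors and the real content to everyone else."""
    return decoy if is_flagged(user_agent, remote_ip) else real


if __name__ == "__main__":
    print(choose_content("ExampleScraperBot/1.0", "198.51.100.1", "real post", "decoy post"))
    print(choose_content("Mozilla/5.0", "198.51.100.1", "real post", "decoy post"))
```

Keep in mind that a User-Agent string is sent by the client and is easy to fake, so a check like this only catches bots that identify themselves or come from known addresses.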
You can download the plugin here. AntiLeech does not really prevent the splogger bots or the sploggers themselves from accessing your site, and they can still copy and paste your posts manually. Still, you have one less thing to worry about.
YugaTech.com is the largest and longest-running technology site in the Philippines. Originally established in October 2002, the site was transformed into a full-fledged technology platform in 2005.
Jomark Osabel says:
I will give this plugin a try. Thanks Yuga.
calvin says:
It looks like other sites are taking content from pinoytravelblog. travelhostel, I think, or something like that. It’s exactly the same, even the categories. Hahaha, it’s like a duplicate of the site but with a different theme. Did you do that on purpose, Abe?
Abe Olandres says:
@calvin, those are sploggers feeding off the rss.
otoyreyes says:
nice feed abe :)
JC John SESE Cuneta says:
yep. If you are researching, you’ll end up getting sites that don’t have the content you are looking for because they just stole it via feeds. I’ve encountered around 12 already that specifically target Pinoy-owned blogs.
Fun to watch, but not fun anymore if you are one of the victims. :p
BrianB says:
How fragile is the intarwebs…
http://www.ohgizmo.com/2008/03/03/the-internet-its-more-tangled-than-you-think/
ms.jane says:
nice post master yuga.
For BrianB: what’s the connection of your link? Tsk tsk, self-promoting. You seem to be doing that more and more often.
Blogoloco says:
I’ve actually seen some of my articles on another website.
Its Alexa rank is far higher than mine, but I don’t understand why they have to do that.
ChrisMo says:
My solution would be an RSS feed with SE-friendly URLs only, so that there isn’t any real content to scrape, just a link to the post on the site. You’d need to write better post titles, though…
JC John SESE Cuneta says:
@ChrisMo: It’s a good solution. However, sites and blogs whose content is legally re-published or syndicated by other sites, or by members of online newspapers, rely heavily on feeds with full post content.
They have no option but to provide it. Also, there are feed subscribers who prefer to read the whole content in the feed rather than visit the site just to read the rest of the post.
It is a war that the Feed/Syndication Community will soon have to face in full force. However, based on my experience and other people’s, RSS-based feeds are mostly the victims, while Atom-based feeds have fewer victims. To begin with, Atom is a web standard; RSS, with its endless flavors, is not.
jhay says:
Nice tip. I’ll give the plugin a try, my blog has been a victim of splogs since last year. This would help turn the tide against them.
JC John SESE Cuneta says:
They’re doing it for the money. Or for testing their scripts. Other than those two reasons, I don’t see other reasons to be plausible for their kind of actions. ;) At least for me.
karla says:
Thanks for the tip!
So many people are scraping blogs these days. Argh! And my rockersworld.com blog is, of course, one of those blogs being scraped.