Black Hat SEO and Your RSS Feed

Last week, while working FeedBlitz customer support, I came across a very interesting and involved case.

The client observed that the daily emails being sent out by FeedBlitz were occasionally missing a post.

As I’ve mentioned before, missing posts are most often caused by cached feeds and, occasionally, duplicate ETags.

This proved not to be the case, and it definitely had nothing to do with a stuck FeedBurner feed.

We also discovered that the On Demand mailing feature did not “see” the missing posts. It was as if the posts did not exist.

When the feed itself was examined, we discovered that it was exceptionally large. The missing posts were being pushed out of the available feed due to the massive amounts of content preceding them. But the posts on the publisher’s web site didn’t seem to be that big.

So the question was: Why was the feed so large when it contained so few articles of seemingly normal content?
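If you want to check your own feed for the same symptom, here is a minimal Python sketch that lists the items in a feed from largest to smallest. It is not the tool we used; it assumes an RSS 2.0 feed whose posts are plain <item> elements, and the function name item_sizes is made up for illustration.

```python
import xml.etree.ElementTree as ET


def item_sizes(feed_xml: str):
    """Return (title, size_in_bytes) pairs for each <item>, largest first.

    Assumes an RSS 2.0 feed with un-namespaced <item> and <title> elements.
    """
    root = ET.fromstring(feed_xml)
    sizes = []
    for item in root.iter("item"):
        title = item.findtext("title", default="(untitled)")
        # Serialize the item back to bytes to estimate how much room it takes.
        size = len(ET.tostring(item, encoding="utf-8"))
        sizes.append((title, size))
    return sorted(sizes, key=lambda pair: pair[1], reverse=True)
```

An item that is many times larger than the post looks on the site is a hint that something you can't see is being included in it.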

Black Hat SEO

As you probably know, your browser interprets HTML code to generate a web page. Feeds are written in XML, a markup language that looks a lot like HTML, but feed readers and email clients generally do not run iframes or JavaScript found in feed content. You can see what the underlying code of a website looks like by pressing Ctrl-U in Chrome and Firefox, at least.

When we viewed the source of this particular feed, we found something incredibly interesting – at least to us! It was full of <output> tags. These tags are HTML5 and are meant to be used within forms. However, in this particular case, a plugin was wrapping massive amounts of hidden content in the output tag, content that looked like:

<output><div id='95_content' style='display:none;'><span style='font-size:small;color:black;'>Some text that I’m not going to include but showed up right here it went on and on and on and included lots of search engine friendly details that wouldn’t show up on the user side of the page</span></div></output>

If you look at the code, take note of the <div id='95_content' style='display:none;'> near the start. The display:none style tells the browser not to show anything inside that div.

Shady Deals

Hidden content is a huge red flag to search engines that says something shady may be happening. Black hat SEO may not have been the intent of the plugin publisher, since the content was generated by a plugin used to curate deals. The effect, though, was massive amounts of hidden content being generated and inserted into the client’s posts, and subsequently their feed. It also explained why the content wasn’t visible on the publisher’s site: it was being hidden from people, but it was taking up massive amounts of space and being read by search engines.

The Fix:

Forms, like the aforementioned iframes and JavaScript, simply do not work within feeds and email, and so <output> tags are a useless artifact. We have made a change to our code that removes <output> tags and any content contained therein.
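To give a feel for what this kind of stripping involves, here is a minimal Python sketch. It is not FeedBlitz's actual code: the function name strip_output_tags and the regular-expression approach are assumptions for illustration, and a simple pattern like this one does not cope with <output> tags nested inside each other.

```python
import re

# Matches an opening <output ...> tag, everything after it, and the first
# closing </output> tag. DOTALL lets the match span multiple lines.
OUTPUT_BLOCK = re.compile(
    r"<output\b[^>]*>.*?</output\s*>",
    re.IGNORECASE | re.DOTALL,
)


def strip_output_tags(html: str) -> str:
    """Remove every <output> element, including whatever it wraps."""
    return OUTPUT_BLOCK.sub("", html)
```

For anything beyond a quick check, a real HTML parser would be a safer choice than a regular expression, since post content is not always well formed.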

We also suggested the client contact the creator of the plugin and mention the generated hidden content.

The Result:

Stripping the content contained within the <output> tag drastically reduced the size of the client’s feed. Once the size of the feed became manageable, all posts were then sent out with the scheduled mailings.

This fix also ensured other FeedBlitz clients will not encounter this problem.

Parting note:

When adding plugins to a site, don’t forget to make sure everything is working as intended. This includes checking the source code of the site, and checking for unexpected links and content.
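One way to make that check less tedious is to measure how much of a page's text is hidden. The Python sketch below is only an illustration under some assumptions: it only looks for inline display:none styles (not CSS classes or external stylesheets), it expects reasonably well-balanced markup, and the names HiddenTextCounter and hidden_text_ratio are made up for this example.

```python
from html.parser import HTMLParser

# Elements that never have a closing tag, so they must not affect nesting depth.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img",
             "input", "link", "meta", "source", "track", "wbr"}


class HiddenTextCounter(HTMLParser):
    """Counts text characters inside and outside inline display:none elements."""

    def __init__(self):
        super().__init__()
        self.hidden_depth = 0   # greater than 0 while inside a hidden element
        self.hidden_chars = 0
        self.visible_chars = 0

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        style = (dict(attrs).get("style") or "").replace(" ", "").lower()
        if self.hidden_depth or "display:none" in style:
            self.hidden_depth += 1

    def handle_endtag(self, tag):
        if tag not in VOID_TAGS and self.hidden_depth:
            self.hidden_depth -= 1

    def handle_data(self, data):
        count = len(data.strip())
        if self.hidden_depth:
            self.hidden_chars += count
        else:
            self.visible_chars += count


def hidden_text_ratio(html: str) -> float:
    """Return the share of text characters that sit inside hidden elements."""
    counter = HiddenTextCounter()
    counter.feed(html)
    counter.close()
    total = counter.hidden_chars + counter.visible_chars
    return counter.hidden_chars / total if total else 0.0
```

A ratio close to zero is what you would normally expect from a post; a post where most of the text is hidden deserves a closer look at whichever plugin generated it.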

Here at FeedBlitz, we have our publishers’ best interests in mind; their success drives our success!