Warning - Search Engines can crawl and index your future WordPress articles

Written by Kevin Muldoon from System0 on May 4, 2007

As I noted last week, I leave next Saturday to travel around beautiful New Zealand for 4 weeks :) I will be checking up on the blog as much as I can, but clearly I can’t always get connected to the net. I have therefore been writing posts in advance and have about a dozen ready so far. I’m hoping to at least double that this week so that a post is published every day while I’m away. It’s been frustrating to see topics covered on other blogs after I have already written about them, but there’s nothing I can do about that.

I was checking the users online page yesterday and noticed that Google was crawling posts I have scheduled for a few weeks’ time. I only found this out by chance, and I was a bit surprised that WordPress doesn’t stop spiders from viewing those pages the way discussion forum software does.

Screenshot: notice Googlebot indexing a post I’m publishing on May 22nd, 2007.

What does this all mean?

The timestamp is a great feature in my opinion, as it allows you to keep posting whilst you’re away from the PC. This can be incredibly handy when you’re away for the weekend or whatever.

If you are working on a post and you just save it, then you will be fine - the page isn’t posted by WordPress and can’t be crawled. However, regardless of the date you give a post, once it’s published, search engines can spider it and the page will be indexed. That means if you publish any post with a future date, there’s a chance it will be listed in search engines before you want it to be.

The reason this happens is that when the publish button is hit, WordPress generates the new page for the post as usual. All the future timestamp feature does is delay the post from being shown on your home page and archives, so the average reader won’t find the post until the date you want them to see it.
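To picture the behaviour described above, here is a toy model in Python. It is not WordPress code, and the names `Post`, `listed_posts` and `fetch_by_url` are made up for illustration. It only assumes what I observed: the listing pages check the date, while a direct request for the page does not.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class Post:
    slug: str              # hypothetical permalink slug
    publish_at: datetime   # the (possibly future) timestamp on the post


def listed_posts(posts, now):
    """Posts shown on the home page and archives: only those already due."""
    return [p for p in posts if p.publish_at <= now]


def fetch_by_url(posts, slug):
    """Direct request for a permalink: no date check in this model,
    so a spider that asks for the URL gets the page."""
    for p in posts:
        if p.slug == slug:
            return p
    return None


posts = [
    Post("already-live", datetime(2007, 5, 1)),
    Post("future-post", datetime(2007, 5, 22)),
]
now = datetime(2007, 5, 4)

print([p.slug for p in listed_posts(posts, now)])    # listing hides the future post
print(fetch_by_url(posts, "future-post") is not None)  # direct fetch still works
```

In this model the future post is missing from the listing but is still returned when requested by its URL, which matches what the spider was doing.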

For me, this isn’t a huge problem right now. The majority of readers of this blog will just read the posts as normal, and if they find the future-dated posts early, who cares! :)

However, I would strongly recommend not publishing anything with a future date if you would mind it being indexed and listed on Google etc. before that date. This could be breaking news, site announcements or whatever.

I would still use the timestamp when you need to, but bear all of this in mind: unless you edit your robots.txt file to stop spiders crawling certain pages, your future posts can be indexed by the search engines.
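If you do go the robots.txt route, it is worth checking that your rules block what you think they block. Here is a minimal sketch using Python’s standard `urllib.robotparser`. The rules and URLs are assumptions for illustration: they presume date-based permalinks such as `/2007/05/22/post-name/` on a placeholder domain, and `is_crawlable` is a hypothetical helper. Your own permalink structure may differ.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt rules: block everything dated May 22nd, 2007.
# Assumes date-based permalinks of the form /YYYY/MM/DD/post-name/.
ROBOTS_TXT = """\
User-agent: *
Disallow: /2007/05/22/
"""


def is_crawlable(robots_txt, url, agent="Googlebot"):
    """Return True if the given rules allow `agent` to fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


# example.com stands in for your own domain.
print(is_crawlable(ROBOTS_TXT, "http://example.com/2007/05/22/my-future-post/"))
print(is_crawlable(ROBOTS_TXT, "http://example.com/2007/05/04/already-live/"))
```

Bear in mind that robots.txt is only a request that well-behaved spiders honour, and you would need to remove the rule once the post’s date arrives.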

Good luck,
Kevin

Filed Under WordPress

4 Responses so far

  1. Brian Heys  |  May 5th, 2007 at 4:20 pm #

    This seems really bizarre to me. I can’t work out how Googlebot can index pages that haven’t yet been published. Unless you’ve linked to them from a live post, how can it possibly find them?

    I’m posting this comment from the garden, by the way. Don’t you just love WiFi? It’s night here in the UK, and a little chilly, so I have a fire going in the garden furnace. I must look pretty strange sat by a roaring fire with the screen of my laptop glowing in the darkness! Kind of prehistoric technology meets 21st century!

  2. new zealand accomadation  |  March 16th, 2008 at 2:48 pm #

    This is the first I have heard of this! I have to agree with Brian: how can a search engine find the post to index it if there is not yet a link on any page pointing to the page newly created by WP? I can see how the page already exists, but there is no link there yet. Curious!

Trackbacks to 'Warning - Search Engines can crawl and index your future wordpress articles'

  1. Brian Heys Writes » Does Google index queued posts?
  2. Napisz dziś, co masz napisać jutro : zielony blogger pl
