Do Google spiders crawl your draft WordPress posts or those that you trashed?
The answer might surprise you.
In this post I’m going to show you how to prevent Google from crawling WordPress posts that are in draft mode or trashed.
What is Google Crawling and Indexing?
Googlebot is a program that finds new or updated web pages on the Internet. These pages are then evaluated for addition to the Google index based on an algorithmic process.
In this process, Googlebot looks at the content, tags, and various page elements to be compiled into the Google search index.
Sometimes, when Googlebot crawls your website, it finds URLs that don’t actually exist. In reality, they are WordPress draft posts or other posts that you trashed.
Imagine my surprise when I went into Google Webmaster Tools and found four URL errors.
There were 2 posts that were never published but were trashed while in draft mode. In addition, there were 2 posts in draft mode.
Google Webmaster Tools reported these as 404 response codes.
While these types of 404 errors don’t harm your site’s performance in search, search engines are still spending time crawling non-existent URLs on your site.
Here’s the deal…
It’s best to prevent these crawl problems before they happen, or to fix them as soon as they show up. In the best-case scenario, you want the Google Webmaster Tools Crawl Errors report to show no errors.
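If you want to confirm what a URL from the Crawl Errors report actually returns, a short script can ask the server directly. The following is a minimal sketch in Python using only the standard library. The function names `response_code` and `not_found` are made up for this illustration, and it assumes the URLs are publicly reachable.

```python
import urllib.error
import urllib.request
from typing import Callable, Iterable


def response_code(url: str, timeout: float = 10.0) -> int:
    """Return the HTTP status code the server sends for `url`.

    Redirects are followed, so this is the code of the final page.
    Network failures (DNS errors, timeouts) are not handled here.
    """
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as error:
        # urllib raises 4xx and 5xx responses as exceptions;
        # the status code is stored on the exception object.
        return error.code


def not_found(
    urls: Iterable[str],
    get_status: Callable[[str], int] = response_code,
) -> list[str]:
    """Return the URLs that answer with a 404, in the order given."""
    return [url for url in urls if get_status(url) == 404]
```

For example, `not_found(["https://example.com/my-draft-post/"])` (a placeholder address, not a real post) would return that URL in the list if the server answers with a 404. The `get_status` parameter is there only so the check can be tried out without touching the network.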
What is a Draft WordPress Post?
Before we get into three simple ways to prevent draft posts from being crawled by Google, let’s agree on what a draft post is.
In WordPress, when you write a post you can choose one of the following statuses prior to publishing:
- Draft
- Pending review
A draft is a post that you may have started and you’ll get back to at a later time.
3 Ways to Prevent Googlebot from Indexing Blog Post Drafts and Trashed Posts
Don’t wait until Googlebot has already crawled your draft or trashed posts to take care of this. If you do, you’ll have to address it in Google Webmaster Tools.
1. If you create WordPress draft posts, you can set the Robots Meta Settings to:
- Apply noindex to this post/page
- Apply nofollow to this post/page
- Apply noarchive to this post/page
If your WordPress theme provides SEO settings, you can find them below the blog post in WordPress.
If you are using an SEO plugin, it may provide these settings as well.
When you turn on these settings, the page tells Googlebot NOT to index this URL, follow its links, or keep a cached copy of it.
2. Create your blog posts using software on your computer instead of directly in WordPress. You can also use Google Docs to create documents that you can later transfer to WordPress once you are happy with them.
Another advantage of NOT keeping drafts in WordPress: if your website crashes and you lose everything, you still have those posts saved elsewhere. Having them in a source program can be a lifesaver.
3. If you trash your WordPress post, be sure to empty the trash right away so the post is not sitting there for Googlebot to find.
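For the first option, it can help to check that the settings really ended up in the page. Themes and SEO plugins typically write them as a `<meta name="robots" ...>` tag in the page’s HTML. That is an assumption about your setup, so view the page source to confirm. The sketch below, in Python with only the standard library, reads that tag from a piece of HTML and lists the directives it finds. The names `RobotsMetaParser` and `robots_directives` are invented for this example.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collect the directives from every <meta name="robots"> tag."""

    def __init__(self) -> None:
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        # Attribute names arrive lowercased; values may be None.
        attributes = {name: (value or "") for name, value in attrs}
        if attributes.get("name", "").lower() != "robots":
            return
        # The content attribute is a comma-separated list of directives.
        for directive in attributes.get("content", "").split(","):
            directive = directive.strip().lower()
            if directive:
                self.directives.add(directive)


def robots_directives(html: str) -> set:
    """Return the set of robots directives found in `html`."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.directives


# Sample markup for illustration only, not taken from a real site.
sample = (
    '<html><head>'
    '<meta name="robots" content="noindex, nofollow, noarchive">'
    '</head><body></body></html>'
)
print(sorted(robots_directives(sample)))  # ['noarchive', 'nofollow', 'noindex']
```

If the result is empty for a page where you switched the settings on, the theme or plugin may be writing the directives somewhere else (for instance in an HTTP header), which this sketch does not look at.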
Now that you know your draft posts can be crawled by Googlebot, what action will you take to prevent it?
If you are not already using Google Webmaster Tools, head over there and get your account set up so you can keep track of the health of your website.