Thursday, 4 December 2008

Google Blog Search Fix Coming

As I reported in Google Blog Search Problems, Google seems to have included the full text of blog pages (including blogrolls) in their blog search. I just saw this post, which confirms both that Google changed something that is causing the problem and that they intend to fix it.

According to Jeremy Hylton of the Google Blog Search team, they now index the full content of the page. This means that they index the full post even if the blog publishes only a partial feed, but also that they index the non-post parts of the page. This is mostly an improvement, of course, but it’s causing some problems, particularly for people who have alerts set or who search for references to themselves, their sites, or their brands when any of these are linked to in blogrolls.

The result is that any time a blog publishes a new post, Google Blog Search picks up the new page, including the sidebar details. So you may get an alert that there’s a new blog post about you, but when you go check it out, you find the post doesn’t even mention you!
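To make the failure mode concrete, here is a minimal Python sketch of the difference between matching a search term against the whole page and matching it against the post alone. This is only an illustration, not how Google Blog Search works internally. The `post-body` and `sidebar` class names, the sample page, and the `TextCollector` and `mentions` helpers are all hypothetical names made up for the example.

```python
from html.parser import HTMLParser

# Tags that never have a closing tag; skipped so nesting depth stays correct.
VOID_TAGS = {"br", "img", "hr", "input", "meta", "link"}


class TextCollector(HTMLParser):
    """Collects all page text, and separately the text inside the post container."""

    def __init__(self, post_class="post-body"):  # class name is an assumption
        super().__init__()
        self.post_class = post_class
        self.depth_in_post = 0  # > 0 while we are inside the post container
        self.full_text = []
        self.post_text = []

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        if self.depth_in_post:
            self.depth_in_post += 1
        elif tag == "div":
            classes = (dict(attrs).get("class") or "").split()
            if self.post_class in classes:
                self.depth_in_post = 1

    def handle_endtag(self, tag):
        if tag in VOID_TAGS:
            return
        if self.depth_in_post:
            self.depth_in_post -= 1

    def handle_data(self, data):
        self.full_text.append(data)
        if self.depth_in_post:
            self.post_text.append(data)


# Hypothetical blog page: the post never mentions the brand, the blogroll does.
PAGE = """
<html><body>
  <div class="post-body"><p>Notes on feed readers and partial feeds.</p></div>
  <div class="sidebar"><ul>
    <li><a href="http://example.com/">Example Brand</a></li>
  </ul></div>
</body></html>
"""


def mentions(term, html, post_only):
    """Return True if term appears in the page text (or only in the post text)."""
    parser = TextCollector()
    parser.feed(html)
    parser.close()
    chunks = parser.post_text if post_only else parser.full_text
    return term.lower() in " ".join(chunks).lower()


print(mentions("Example Brand", PAGE, post_only=False))
print(mentions("Example Brand", PAGE, post_only=True))
```

Matching against the full page finds the brand name in the blogroll link, which is the false alert described above. Restricting the match to the post container does not, because the post itself never mentions the brand.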

Jeremy says:

We do expect to fix the problem you’re seeing. We’ll use the full page content, but exclude the content that isn’t really part of the post. I’m not sure if we’ll be able to make the change before the end of the year, but we are working on it and are pretty confident that it can be solved.

Glad to know they are going to fix it. Hopefully I can leave all my blog search filters in place and they will stop feeding me a lot of extra stuff.
