On one of the large sites I SEO and monitor traffic for, we’ve seen a large bump in search.live.com traffic over the past few months. The keywords used don’t always make sense for the page being landed on, and the keyword phrase is always a single word.

I’ve assumed (shame on me?) that it was a bug somewhere related to search.live.com that is only allowing the first keyword through to our WebTrends Analytics tool. Andrew Urquhart has commented on his similar theory for the problem.

For yesterday’s stats, MSN had a significantly large volume of traffic that just didn’t seem to make any sense. If it’s wrong, it’s no longer statistically background noise, but rather seriously impactful data that puts the legitimacy of the log data into question.

Looking for answers, I directed my browser over to WebMasterWorld to check out the MSN forum. The Strange Referrer Activity post was right near the top. Among other things, this post on Sept. 5th 2007 was interesting, and a major cause for concern:

Thanks for all the feedback on this thread.

First, we appreciate the concerns and issues that have been raised and apologize for any inconvenience this might have caused.

Second, we want to explain what this is all about. The traffic you are seeing is part of a quality check we run on selected pages. While we work on addressing your concerns, we would request that you do not actively block the IP addresses used by this quality check; blocking these IP addresses could prevent your site from being included in the Live Search index.

Please keep the feedback and thoughts coming as we will use this to help improve this process and make sure that it impacts your sites as little as possible.

thanks
- msndude (msd)

That seems to confirm that at least some of this may be bogus traffic. How much is bogus? We have no easy way to know…

That should be very concerning to any webmaster who values the credibility of their log data. If nothing else, it puts thousands of historical search.live.com referrals from the past few months into question.

It concerns me that:

  1. Microsoft does not seem to care about compromising the validity of website log files on a global scale.
  2. Microsoft is making it look like websites are getting human traffic from search.live.com when it may in fact be a Microsoft bot. I’ve heard the phrase “fake it until you make it”, but that should not be a valid search engine market share tactic!
  3. There is no obvious and/or official way to correctly cleanse the corrupted log files of this bogus traffic. (Not to mention the time I would need to spend doing it.)
  4. Microsoft has mostly been silent on this issue.
  5. The bot is ignoring the robots.txt rules, and does not identify itself as a bot.
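
In the absence of an official fix, here is a rough sketch of how the suspect hits could at least be set aside for review. It is not an official method and will not be exact. It assumes the server writes the Apache combined log format (with plain straight quotes in the raw file), that the bogus hits carry a search.live.com referrer with a one-word `q` parameter, and that they come from addresses starting with `65.55.`, a prefix taken only from the sample log lines in the comments below. The names `is_suspect`, `split_log` and `SUSPECT_IP_PREFIX` are made up for this illustration.

```python
import re
from urllib.parse import urlparse, parse_qs

# Combined log format:
# ip ident user [time] "request" status bytes "referrer" "user-agent"
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referrer>[^"]*)" "(?P<agent>[^"]*)"'
)

# Assumption: prefix seen in the sample log lines, not an official list.
SUSPECT_IP_PREFIX = "65.55."


def is_suspect(line):
    """Return True if a log line matches the suspected quality-check pattern."""
    match = LOG_LINE.match(line)
    if not match:
        return False
    referrer = urlparse(match.group("referrer"))
    if referrer.hostname != "search.live.com":
        return False
    # parse_qs decodes "+" and "%20" to spaces, so multi-word queries split.
    query = parse_qs(referrer.query).get("q", [""])[0]
    single_word = len(query.split()) == 1
    return single_word and match.group("ip").startswith(SUSPECT_IP_PREFIX)


def split_log(lines):
    """Separate log lines into (clean, suspect) lists, preserving order."""
    clean, suspect = [], []
    for line in lines:
        (suspect if is_suspect(line) else clean).append(line)
    return clean, suspect
```

Running `split_log` over an open log file gives two lists, so the suspect lines can be inspected or written to a separate file rather than deleted outright. A real visitor on the same address range searching for a single word would be flagged too, so treat the result as an estimate.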

I don’t follow MSN very much since they don’t send very much traffic, so I don’t really know where to look for these things, but I’ve found no official mention of this other than the WebmasterWorld post.

Frankly, this seems to me to be very unethical of Microsoft on a number of levels. For one thing, many website operators probably think they actually ARE getting an increase in traffic from search.live.com.

If you run a Website, this should concern you greatly. We need answers and solutions. Please comment if you have either.

Comments

7 Responses to “Is Microsoft Live Search stuffing our log files?”

  1. Pocket SEO on October 18th, 2007 9:15 am

    I’ve seen the same thing on my sites. One word referrals from MSN that do not make sense.

  2. My Worst Referrer Spam: MSN Live.com - Pocket SEO on October 18th, 2007 12:32 pm

    […] noticing a lot of strange one-word referrers from MSN Live Search in my logs. Today I saw a post by BitWorm that reveals this as referrer spam-like behavior from […]

  3. Cobracam on November 12th, 2007 8:31 pm

    Hello…Man i just love your blog, keep the cool posts comin..holy Monday

  4. Jim on December 3rd, 2007 8:02 pm

    Maybe it is a distributed denial of service attack against live.com search engine.

    My stats show 85% of one of my customers’ web site traffic coming from livebot-65-55-214-156.search.live.com .

    The steady increase in traffic fits the profile of a denial of service attack, albeit a slow one: the traffic grows slowly yet surely for this one customer. And this is not bogus traffic: my server, which has 100 or so customers on it, has had its traffic double in the last few months, from roughly 170 kilobits per second to 340 kilobits per second.

  5. Peon on December 3rd, 2007 8:07 pm

    I would take a closer look at the referrals from live.com. I’ve noticed something similar, but when I look at the logs, I see this:
    #grep live.com access-log
    65.55.165.13 - - [13/Nov/2007:10:43:20 -0500] "GET /forum/index.php?action=unread;board=12.0 HTTP/1.0" 200 13753 "http://search.live.com/results.aspx?q=login&mrt=en-us&FORM=LIVSOP" "Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.2; .NET CLR 1.1.4322)"
    65.55.165.38 - - [13/Nov/2007:13:32:25 -0500] "GET /forum/index.php?action=help;page=pm HTTP/1.0" 200 24056 "http://search.live.com/results.aspx?q=personal&mrt=en-us&FORM=LIVSOP" "Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.2; .NET CLR 1.1.4322)"

    Note that the IP address is in the proximity of the MSN spiders. Each one of these requests looks like a complete hit from a normal browser - loads images, css, etc.

    I’ve been watching this for a few weeks, and what appears to be happening is that MSN is performing a *full* page view for *each* keyword that is associated with the site, in such a way that the keyword looks like it’s in the top results. It’s been happening on my sites for a few weeks, though it’s winding down today.

    I suspect MSN is trying to clean up its results by performing some kind of intelligent analysis of keyword referrals, in a way that makes sure it is not being duped by SEO “tricks”.

    That’s my theory anyway.

  6. Anonymous on December 3rd, 2007 8:27 pm
  7. MSN Comes Clean on Fake Search Traffic : The BitWorm Search Blog on December 5th, 2007 2:55 pm

    […] Is Microsoft Live Search stuffing our log files? […]
