The BitWorm SEO Blog

October 18, 2007

Is Microsoft Live Search stuffing our log files?

Filed under: Microsoft, SEO — admin @ 8:39 am

On one of the large sites I do SEO and traffic monitoring for, we’ve seen a large bump in search.live.com traffic over the past few months. The keywords don’t always make sense for the page being landed on, and the keyword phrase is always a single word.

I’ve assumed (shame on me?) that it was a bug somewhere in search.live.com that was only allowing the first keyword through to our WebTrends Analytics tool. Andrew Urquhart has posted a similar theory about the problem.

In yesterday’s stats, MSN accounted for a large volume of traffic that just didn’t seem to make any sense. If that traffic is bogus, it is no longer statistical background noise, but a volume large enough to put the legitimacy of the log data into question.

Looking for answers, I directed my browser over to WebmasterWorld to check out the MSN forum. The Strange Referrer Activity post was right near the top. Among other things, this post from Sept. 5th, 2007 was interesting, and a major cause for concern:

Thanks for all the feedback on this thread.

First, we appreciate the concerns and issues that have been raised and apologize for any inconvenience this might have caused.

Second, we want to explain what this is all about. The traffic you are seeing is part of a quality check we run on selected pages. While we work on addressing your concerns, we would request that you do not actively block the IP addresses used by this quality check; blocking these IP addresses could prevent your site from being included in the Live Search index.

Please keep the feedback and thoughts coming as we will use this to help improve this process and make sure that it impacts your sites as little as possible.

thanks
- msndude (msd)

That seems to confirm that at least some of this may be bogus traffic. How much is bogus? We have no easy way to know…

That should be very concerning to any webmaster who values the credibility of their log data. If nothing else, that puts thousands of historical search.live.com referrals from the past few months into question.

It concerns me that:

  1. Microsoft does not seem to care about messing with the validity of log files for websites on a global scale.
  2. Microsoft is making it look like websites are getting human traffic from search.live.com when it may in fact come from a Microsoft bot. I’ve heard the phrase “fake it until you make it”, but that should not be a valid search engine market share tactic!
  3. There is no obvious and/or official way to correctly cleanse the corrupted log files of this bogus traffic. (Not to mention the time I need to spend to do it.)
  4. Microsoft has mostly been silent on this issue.
  5. The bot is ignoring the robots.txt rules, and does not identify itself as a bot.
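
On point 3: in the absence of anything official, a rough estimate is still possible. Here is a minimal Python sketch, assuming the only signal is the one described above: a search.live.com referrer carrying a single-word query. The function names are hypothetical, and the heuristic will also flag genuine single-word searches made by real people, so treat the result as an upper bound on the bogus traffic rather than a true clean-up.

```python
from urllib.parse import urlparse, parse_qs

# Assumption: the referrer host named in this post; adjust if your logs differ.
SUSPECT_HOST = "search.live.com"


def is_suspect_referrer(referrer: str) -> bool:
    """Return True for a search.live.com referrer whose query is a single word."""
    parts = urlparse(referrer)
    if parts.hostname != SUSPECT_HOST:
        return False
    # parse_qs decodes "+" and "%20" to spaces, so multi-word queries split cleanly.
    terms = parse_qs(parts.query).get("q", [])
    if len(terms) != 1:
        return False
    return len(terms[0].split()) == 1


def split_referrers(referrers):
    """Partition referrer URLs into (suspect, other) lists, preserving order."""
    suspect, other = [], []
    for ref in referrers:
        (suspect if is_suspect_referrer(ref) else other).append(ref)
    return suspect, other
```

Feeding it the referrer column exported from a log analyzer gives a count of suspect referrals that can be compared against the total for search.live.com. It does not look at IP addresses or user agents, since the bot does not identify itself.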

I don’t follow MSN very much since they don’t send very much traffic, so I don’t really know where to look for these things, but I’ve found no official mention of this other than the WebmasterWorld post.

Frankly, this seems to me to be very unethical of Microsoft on a number of levels. For one thing, many website operators probably think they actually ARE getting an increase in traffic from search.live.com.

If you run a Website, this should concern you greatly. We need answers and solutions. Please comment if you have either.

10 Comments »

  1. I’ve seen the same thing on my sites. One word referrals from MSN that do not make sense.

    Comment by Pocket SEO — October 18, 2007 @ 9:15 am

  2. [...] noticing a lot of strange one-word referrers from MSN Live Search in my logs. Today I saw a post by BitWorm that reveals this as referrer spam-like behavior from [...]

    Pingback by My Worst Referrer Spam: MSN Live.com - Pocket SEO — October 18, 2007 @ 12:32 pm

  3. Hello…Man i just love your blog, keep the cool posts comin..holy Monday

    Comment by Cobracam — November 12, 2007 @ 8:31 pm

  4. Maybe it is a distributed denial of service attack against live.com search engine.

    My stats show 85% of one of my customers’ web site traffic coming from livebot-65-55-214-156.search.live.com .

    The steady increase in traffic fits the profile of a denial of service attack, albeit a slow one: the traffic grows slowly but surely for this one customer. And this is not bogus traffic: my server, which has 100 or so customers on it, has had its traffic double in the last few months, from roughly 170 kilobits per second to 340 kilobits per second.

    Comment by Jim — December 3, 2007 @ 8:02 pm

  5. I would take a closer look at the referrals from live.com. I’ve noticed something similar, but when I look at the logs, I see this:
    # grep live.com access-log
    65.55.165.13 - - [13/Nov/2007:10:43:20 -0500] "GET /forum/index.php?action=unread;board=12.0 HTTP/1.0" 200 13753 "http://search.live.com/results.aspx?q=login&mrt=en-us&FORM=LIVSOP" "Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.2; .NET CLR 1.1.4322)"
    65.55.165.38 - - [13/Nov/2007:13:32:25 -0500] "GET /forum/index.php?action=help;page=pm HTTP/1.0" 200 24056 "http://search.live.com/results.aspx?q=personal&mrt=en-us&FORM=LIVSOP" "Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.2; .NET CLR 1.1.4322)"

    Note that the IP address is in the proximity of the MSN spiders. Each one of these requests looks like a complete hit from a normal browser - loads images, css, etc.

    I’ve been watching this for a few weeks and what appears to be happening is that MSN is performing a *full* page view for *each* keyword that is associated with the site, in such a way that the keyword looks like it’s in the top results. It’s been happening on my sites for a few weeks, though it’s winding down today.

    I suspect that MSN is trying to clean up results by performing some kind of intelligent analysis of keyword referrals in a way that makes sure they are not being duped by SEO “tricks”.

    That’s my theory anyway.

    Comment by Peon — December 3, 2007 @ 8:07 pm

  6. http://sebastians-pamphlets.com/microsoft-live-search-the-downfall-of-a-tiny-search-engine/

    Seems to be the best explanation out there.

    Comment by Anonymous — December 3, 2007 @ 8:27 pm

  7. [...] Is Microsoft Live Search stuffing our log files? [...]

    Pingback by MSN Comes Clean on Fake Search Traffic : The BitWorm Search Blog — December 5, 2007 @ 2:55 pm

  8. [...] Is Microsoft Live Search stuffing our log files [...]

    Pingback by MSN Live Search Spam BOT Cloaked referrals - Yack Yack SEO — June 21, 2008 @ 2:23 am

  9. [...] came across this article. For some reason I started checking out IP addresses because my users online box had trouble [...]

    Pingback by ALTERNATE-REALITY.NET » Blog Archive » Live Search messing with logs — June 24, 2008 @ 7:04 am

  10. I suspect there’s a simple explanation for all this … just follow the money.

    I think M$ is trying to convince people that their search.live.com is sending you all kinds of visitors … and, of course, that means you should be advertising with them.

    It’s possible that search.live.com is not only showing people the search hits BUT also visits (at least) some (large!) number of pages in that list.

    E.G., when I look at the referrers for the last 3 days for one of our sites, I see entries like:
    - 10 hits : http://search.live.com/results.aspx?q=license
    - 1 hit : http://search.live.com/results.aspx?q=licenses
    - 1 hit : http://search.live.com/results.aspx?q=upgrades
    - 7 hits : http://search.live.com/results.aspx?q=contacts
    … and on and on.

    So, if they’re hitting our pages for these kinds of search terms, you can imagine that they’re hitting hundreds of thousands of pages, possibly for each person’s search.

    That way it looks like there are “sooooo many people” visiting your site because they’ve used search.live.com … and, of course, that’d mean you really should be advertising with them.

    When all this started, we suddenly saw referrals go from about 3% of the number of Google search referrals to more than the number of Google-search referrals … all within a few days. Analyzing the resulting traffic from these referrals makes it fairly obvious that they’re not real people visiting.

    In the end, I just attribute this to the normal kind of business behavior that’s consistent for Microsoft. What sleazebags!

    Comment by A Site Admin — October 28, 2008 @ 2:29 pm
