Archive for the 'Search engines' Category

Flash intros are dead and dying

Google Blogoscoped has a screenshot of a new Google feature that lets users skip Flash intro pages. This is cool because Flash intro pages are useless, time-wasting pieces of crap.

The point of an intro page is to mesmerize website visitors with a flashy cover page so they become impressed with your website and your business. This makes little sense to me because regardless of how fancy your intro page may be, the underlying website still either has what the user wants or it does not. If a website is a piece of crap, it is still a piece of crap with an intro page.

Give users the information they are looking for. There is no need to waste someone’s time when they have given you the opportunity to inform them, market to them and sell them something on your website.

User-agent: PRCrawler/Nutch-0.9

This obscure bot popped up on my radar earlier this month. The complete user-agent string is

PRCrawler/Nutch-0.9 (data mining development project; crawler@projectrialto.com)

The description provided in the string contains several clues that this bot is a waste of my bandwidth. First, Nutch is an open source search engine written in Java. Second, ‘data mining’ is not an exercise to which I am interested in offering my assistance, especially in the form of my server resources. Third, ‘development’ and ‘project’ are both hints that this crawler is experimental and may do the world no good at all. Here is how the creators of this bot explain its purpose:

Corey,

Project Rialto is a new online security services solution provider that monetizes its infrastructure investment via relevant advertising for its users. We accomplish this in a very unobtrusive and anonymous method. Our bot is crawling in order to understand the contents of web sites our users visit to assist in serving more relevant content.

We are currently in our initial development phases. As Project Rialto approaches its market launch we’ll provide more information about our offering.
We hope this addresses your concerns; please let us know if you have any other questions.

Regards,

Kelvin Edmison
Software Architect
Project Rialto

This loosely translates to, “we scraped your site to serve someone advertisements based on its content.” I found traces of this bot in one of my error database tables, so we are certainly seeing evidence of a development phase. IncrediBILL agrees that this bot will do no good for your site, and has compiled an IP address list in his usual “get lost” fashion.

Here is some robots.txt love from me to you that will block the bot that hit me. Robots.txt rules are matched on the bot’s short name rather than on the full user-agent string, so the short name is what goes on the User-agent line:

User-agent: PRCrawler
Disallow: /
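A rule like this can be sanity-checked before you rely on it. The sketch below uses Python's standard `urllib.robotparser` and assumes the short `PRCrawler` name on the User-agent line; the helper name `is_allowed` and the sample paths are made up for illustration. It only shows how one parser reads the rule. The bot's own parser may behave differently.

```python
from urllib.robotparser import RobotFileParser

# Assumed rule: the bot's short name, not its full user-agent string.
ROBOTS_TXT = """\
User-agent: PRCrawler
Disallow: /
"""

# The full user-agent string the bot sent, as quoted in the post.
PRCRAWLER_UA = (
    "PRCrawler/Nutch-0.9 "
    "(data mining development project; crawler@projectrialto.com)"
)


def is_allowed(robots_txt: str, user_agent: str, path: str) -> bool:
    """Return True if Python's parser would let user_agent fetch path."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, path)


if __name__ == "__main__":
    # Hypothetical path and second bot name, for illustration only.
    print("PRCrawler allowed:", is_allowed(ROBOTS_TXT, PRCRAWLER_UA, "/any/page.html"))
    print("Other bot allowed:", is_allowed(ROBOTS_TXT, "SomeOtherBot/1.0", "/any/page.html"))
```

Python's parser compares only the part of the bot's name before the first slash, which is one reason to prefer the short name in the rule.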

Using robots.txt exclusion only works for bots that behave properly. Bad bots do not care if you do not want them, and the only dependable way to keep them from crawling your site is to block the IP addresses they use.
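For anyone who wants to see what that kind of block looks like in application code, here is a minimal sketch using Python's standard `ipaddress` module. The network ranges are placeholders taken from the reserved documentation blocks, not the bot's real addresses, and the function name `should_block` is hypothetical. The user-agent check is a convenience only, since a bad bot can send any string it likes.

```python
import ipaddress

# Placeholder ranges from the reserved documentation blocks. Replace them
# with the addresses that actually appear in your own logs.
BLOCKED_NETWORKS = [
    ipaddress.ip_network("192.0.2.0/24"),
    ipaddress.ip_network("203.0.113.16/28"),
]

# Lowercase fragments to look for in the user-agent header.
BLOCKED_AGENT_FRAGMENTS = ["prcrawler"]


def should_block(remote_ip: str, user_agent: str) -> bool:
    """Return True if the request comes from a blocked range or agent."""
    address = ipaddress.ip_address(remote_ip)
    if any(address in network for network in BLOCKED_NETWORKS):
        return True
    agent = user_agent.lower()
    return any(fragment in agent for fragment in BLOCKED_AGENT_FRAGMENTS)


if __name__ == "__main__":
    print(should_block("192.0.2.77", "Mozilla/5.0"))
    print(should_block("198.51.100.5", "PRCrawler/Nutch-0.9"))
    print(should_block("198.51.100.5", "Mozilla/5.0"))
```

In practice the same list usually lives in the web server or firewall configuration, where the request is rejected before it reaches your application.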

Well, what are you searching for?

Strategies cannot be templated. There is no substitute, no automated gimmick, that can replicate what can be achieved by people who share a comfortable and open line of communication.

A local luxury-used outfit approached me two weeks ago because they were unsatisfied with their website’s page 4 rankings. Today they are front-paging for half of the searches the dealer identified. The difference was no trick, just a five-minute conversation and less than five minutes of work on the site. The search engine, Google in this case, has recrawled the site and adjusted its rankings for the first time since the change.

A lack of communication was the reason for the previous positions, not any technical error. This business has an address in a small town that is not easy to spell. More people search for the county name, which is also the name of the closest large city. There was no mention of the county name on the website, and no links to the dealer’s website from elsewhere on the web contained the county name or appeared near it. Search engines do not volunteer this keyword association to websites; they organize information that already exists.
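Once the dealer has named the phrases that matter, checking whether a page mentions them at all is easy to automate. The sketch below is one rough way to do it with Python's standard `html.parser`. The function name `keyword_coverage`, the sample page, and the place names in it are all invented for illustration. It only looks for exact phrases in the title and the remaining page text, which is a far cruder test than anything a search engine does.

```python
from html.parser import HTMLParser


class _TextCollector(HTMLParser):
    """Collect the <title> text separately from the rest of the page text."""

    def __init__(self):
        super().__init__()
        self.in_title = False
        self.title_parts = []
        self.text_parts = []

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self.in_title = True

    def handle_endtag(self, tag):
        if tag == "title":
            self.in_title = False

    def handle_data(self, data):
        target = self.title_parts if self.in_title else self.text_parts
        target.append(data)


def _normalize(text):
    """Lowercase the text and collapse runs of whitespace."""
    return " ".join(text.lower().split())


def keyword_coverage(html, phrases):
    """Report, per phrase, whether it appears in the title and in the text."""
    collector = _TextCollector()
    collector.feed(html)
    collector.close()
    title = _normalize("".join(collector.title_parts))
    body = _normalize(" ".join(collector.text_parts))
    return {
        phrase: {
            "title": _normalize(phrase) in title,
            "body": _normalize(phrase) in body,
        }
        for phrase in phrases
    }


if __name__ == "__main__":
    # Invented page and place names, for illustration only.
    page = (
        "<html><head><title>Used Luxury Cars in Smallville</title></head>"
        "<body><p>Visit our lot in Smallville.</p></body></html>"
    )
    for phrase, found in keyword_coverage(page, ["Smallville", "Example County"]).items():
        print(phrase, found)
```

A phrase that comes back missing from both the title and the text is the kind of gap described above, and a good thing to bring up in the next conversation with the dealer.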

Machines and readymades will always be tools that are only as good as the person operating them. One of the most important questions that I can ask a dealer is, “What are you searching for?” Every search marketing campaign I begin has to start with at least one keyword phrase.

Live.com converting three times as often

Here’s a screenshot of the Search Engines Goal Conversion tab on Google Analytics for a classifieds site I’ve been marketing since December 2006. The goal, Conducted Search, is reached when a visitor to the site searches the vehicles for sale database. The percentages represent the share of visitors from each search engine who searched for vehicles after arriving at the site.

I believe this is happening because the site was optimized for very few keywords per page, which resulted in something that MSN/Live.com loves: strong title tags. To find this report within your Google Analytics account, click Traffic Sources then Search Engines and then Goal Conversion in the middle of the page.

Yahoo! Autos caught cloaking

The page has clearly been edited since this has gone public, but this is silly. A search engine cloaking pages to rank better in search engines. What gives?

So, they are clearly serving different content to their users and to the search engines. The pages they’re serving to the bots are heavily keyword stuffed. I honestly don’t know where they could cram the word “used cars” on the page again. It’s everywhere.

http://www.agerhart.com/seo-rankings/yahoo-caught-cloaking-will-they-ban-themselves/

Update: Laura Lippay, Yahoo’s SEO Program Manager, responds.