Blog posts of September 2011

The Importance of Backlinks - Tuesday, September 06, 2011

If you've read anything about or studied Search Engine Optimization, you've come across the term "backlink" at least once. For those of you new to SEO, you may be wondering what backlinks are and why they are important. Backlinks have become so important to Search Engine Optimization that they are now among the main building blocks of good SEO. In this article, we will explain to you what a backlink is, why backlinks are important, and what you can do to help gain them while avoiding getting into trouble with the Search Engines.

What are "backlinks"? Backlinks are links that are directed towards your website. Also knows as Inbound links (IBL's). The number of backlinks is an indication of the popularity or importance of that website. Backlinks are important for SEO because some search engines, especially Google, will give more credit to websites that have a good number of quality backlinks, and consider those websites more relevant than others in their results pages for a search query.

When search engines calculate the relevance of a site to a keyword, they consider the number of QUALITY inbound links to that site. So we should not be satisfied with merely getting inbound links; it is the quality of the inbound link that matters.
A search engine considers the content of the sites to determine the QUALITY of a link. When inbound links to your site come from other sites, and those sites have content related to your site, these inbound links are considered more relevant to your site. If inbound links are found on sites with unrelated content, they are considered less relevant. The higher the relevance of inbound links, the greater their quality.

For example, if a webmaster has a website about how to rescue orphaned kittens and receives a backlink from another website about kittens, then that would be seen as more relevant in a search engine's assessment than, say, a link from a site about car racing. The more relevant the site that is linking back to your website, the better the quality of the backlink.

Search engines want websites to have a level playing field, and look for natural links built slowly over time. While it is fairly easy to manipulate links on a web page to try to achieve a higher ranking, it is a lot harder to influence a search engine with external backlinks from other websites. This is also a reason why backlinks factor so heavily into a search engine's algorithm. Lately, however, search engines' criteria for quality inbound links have gotten even tougher, thanks to unscrupulous webmasters trying to gain these inbound links through deceptive or sneaky techniques, such as hidden links or automatically generated pages whose sole purpose is to provide inbound links to websites. These pages are called link farms, and they are not only disregarded by search engines, but linking to a link farm could get your site banned entirely.

Another reason to achieve quality backlinks is to entice visitors to come to your website. You can't build a website, and then expect that people will find your website without pointing the way. You will probably have to get the word out there about your site. One way webmasters got the word out used to be through reciprocal linking. Let's talk about reciprocal linking for a moment.

There has been much discussion in the last few months about reciprocal linking. In the last Google update, reciprocal links were one of the targets of the search engine's latest filter. Many webmasters had agreed upon reciprocal link exchanges in order to boost their sites' rankings with the sheer number of inbound links. In a link exchange, one webmaster places a link on his website that points to another webmaster's website, and vice versa. Many of these links were simply not relevant, and were just discounted. So while the irrelevant inbound link was ignored, the outbound links still got counted, diluting the relevancy score of many websites. This caused a great many websites to drop off the Google map.

We must be careful with our reciprocal links. There is a Google patent in the works that will deal with not only the popularity of the sites being linked to, but also how trustworthy a site is that you link to from your own website. This will mean that you could get into trouble with the search engine just for linking to a bad apple. We could begin preparing for this future change in the search engine algorithm by being choosier right now about which sites we exchange links with. By choosing only relevant sites to link with, sites that don't have tons of outbound links on a page and that don't practice black-hat SEO techniques, we will have a better chance that our reciprocal links won't be discounted.

Many webmasters have more than one website. Sometimes these websites are related, sometimes they are not. You also have to be careful about interlinking multiple websites on the same IP. If you own seven related websites, then a link to each of those websites on a page could hurt you, as it may look to a search engine like you are trying to do something fishy. Many webmasters have tried to manipulate backlinks in this way, and too many links to sites with the same IP address is referred to as backlink bombing.

One thing is certain: interlinking sites doesn't help you from a search engine standpoint. The only reason you may want to interlink your sites in the first place might be to provide your visitors with extra resources to visit. In this case, it would probably be okay to provide visitors with a link to another of your websites, but try to keep many instances of linking to the same IP address to a bare minimum. One or two links on a page here and there probably won't hurt you.

There are a few things to consider when beginning your backlink building campaign. It is helpful to keep track of your backlinks, to know which sites are linking back to you, and how the anchor text of the backlink incorporates keywords relating to your site. A tool to help you keep track of your backlinks is the Domain Stats Tool. This tool displays the backlinks of a domain in Google, Yahoo, and MSN. It will also tell you a few other details about your website, like your listings in the Open Directory (DMOZ), whose backlinks Google regards as highly important; your Alexa traffic rank; and how many pages from your site have been indexed, to name just a few.

Another tool to help you with your link building campaign is the Backlink Builder Tool. It is not enough just to have a large number of inbound links pointing to your site. Rather, you need to have a large number of QUALITY inbound links. This tool searches for websites that have a theme related to your website and that are likely to add your link to their site. You specify a particular keyword or keyword phrase, and then the tool seeks out related sites for you. This helps to simplify your backlink building efforts by helping you create quality, relevant backlinks to your site, and making the job easier in the process.

There is another way to gain quality backlinks to your site, in addition to related site themes: anchor text. When a link incorporates a keyword into the text of the hyperlink, we call this quality anchor text. A link's anchor text may be one of the most under-estimated resources a webmaster has. Instead of using words like "click here", which probably won't relate in any way to your website, using the words "Please visit our tips page for how to nurse an orphaned kitten" is a far better way to utilize a hyperlink. A good tool for helping you find your backlinks and what text is being used to link to your site is the Backlink Anchor Text Analysis Tool. If you find that your site is being linked to from another website, but the anchor text is not being utilized properly, you should request that the website change the anchor text to something incorporating relevant keywords. This will also help boost your quality backlinks score.
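To make the idea concrete, here is a minimal sketch of what weak and keyword-rich anchor text look like in the HTML itself; the URL, file name and wording are invented for illustration:

<a href="http://www.example.com/kitten-tips.html">click here</a>

<a href="http://www.example.com/kitten-tips.html">tips for nursing an orphaned kitten</a>

The first link tells a search engine nothing about the target page, while the second ties the relevant keywords directly to it.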

Building quality backlinks is extremely important to Search Engine Optimization, and because of their importance, it should be very high on your priority list in your SEO efforts. We hope you have a better understanding of why you need good quality inbound links to your site, and have a handle on a few helpful tools to gain those links.

Comments (0)
The Age of a Domain Name - Tuesday, September 06, 2011

One of the many factors in Google's search engine algorithm is the age of a domain name. In a small way, the age of a domain gives the appearance of longevity and therefore a higher relevancy score in Google.

Because spam sites tend to pop up and die off quickly, the age of a domain is often a sign of whether a site is yesterday's news or tomorrow's popular site. We see this in the world of business, for example. While the novelty that may go with a new store in town brings a short burst of initial business, people tend to trust a business that has been around for a long time over one that is brand new. The same is true for websites. Or, as Rob from BlackwoodProductions.com says, "Rent the store (i.e. register the domain) before you open for business".

Two things that are considered in the age of a domain name are:

  • The age of the website
  • The length of time a domain has been registered

The age of the website is built up of how long the content has actually been on the web, how long the site has been in promotion, and even the last time content was updated. The length of time a domain has been registered is measured not only by the actual date the domain was registered, but also by how long it is registered for. Some domains are only registered for a year at a time, while others are registered for two, five, or even ten years.

In the latest Google update, which SEOs call the Jagger Update, some of the big changes seen were the importance given to age: the age of incoming links, the age of web content, and the date the domain was registered. There were many things, in reality, that were changed in this last update, but since we're talking about the age of a domain, we'll only deal with those issues specifically. We'll talk more in other articles about other factors you will want to be aware of that Google changed in their evaluation criteria of websites on the Internet.

One of the ways Google minimizes search engine spam is by giving new websites a waiting period of three to four months before giving them any kind of PageRank. This is referred to as the "sandbox effect". It's called the "sandbox effect" because it has been said that Google wants to see if those sites are serious about staying around on the web. The sandbox analogy comes from the idea that Google throws all of the new sites into a sandbox and lets them play together, away from all the adults. Then, when those new sites "grow up", so to speak, they are allowed to be categorized with the "adults", or the websites that aren't considered new.

What does this mean to you? For those of you with new websites, you may be disappointed in this news, but don't worry. There are some things you can do while waiting for the sandbox period to expire, such as concentrating on your backlink strategies, promoting your site through Pay-per-click, articles, RSS feeds, or in other ways. Many times, if you spend this sandbox period wisely, you'll be ready for Google when it does finally assign you a PageRank, and you could find yourself starting out with a great PageRank!

Even though the domain's age is a factor, critics believe it only gets a little weight in the algorithm. Since the age of your domain is something you have no control over, it doesn't necessarily mean that your site isn't going to rank well in the Search Engine Results Pages (SERPs). It does mean, however, that you will have to work harder in order to build up your site popularity and concentrate on factors that you can control, like inbound links and the type of content you present on your website.

So what happens if you change your domain name? Does this mean you're going to get a low grade with a search engine if you have a new site? No, not necessarily. There are a few things you can do to help ensure that your site won't get lost in the SERPs because of the age of the domain.

1. Make sure you register your domain name for the longest amount of time possible. Many registrars allow you to register a domain name for as long as five years, and some even longer. Registering your domain for a longer period of time gives an indication that your site intends to be around for a long time, and isn't going to just disappear after a few months. This will help boost your score with regards to your domain's age.

2. Consider registering a domain name even before you are sure you're going to need it. We see many domains out there that, even though they are registered, don't have a website to go with them. This could mean that the site is in development, or simply that someone saw the value of that particular domain name and wanted to snatch it up before someone else did. There doesn't seem to be any problem with this method so far, so it certainly can't hurt you to buy a domain name you think could be catchy, even if you end up just selling it later on.

3. Think about purchasing a domain name that was already pre-owned. Not only will this allow you to avoid the "sandbox effect" of a new website in Google, but it also allows you to keep whatever PageRank may have already been attributed to the domain. Be aware that most pre-owned domains with PageRank aren't as cheaply had as a new domain, but it might be well worth it to you to invest a bit more money right at the start.

4. Keep track of your domain's age. One of the ways you can determine the age of a domain is with this handy Domain Age Tool. It allows you to view the approximate age of a website on the Internet, which can be very helpful in determining what kind of edge your competitors might have over you, and even what a site might have looked like when it first started.

To use it, simply type in the URL of your domain and the URLs of your competitors, and click submit. This will give you the age of the domains and other interesting information, like anything that had been cached from the site initially. This could be especially helpful if you are purchasing a pre-owned domain.

Because trustworthy sites are going to be the wave of the future, factoring in the age of a domain is a good idea. A site that has been around for years may suddenly go belly-up, and the next big eBay or Yahoo! just might be getting its start, so age alone is not a full measure of how trustworthy a site is or will be. This is why many other factors weigh into a search engine's algorithm and not just a single factor alone. What we do know is that age has become more important than it was previously, and there are only good things to be said about having a site that's been around for a while.

Comments (0)
Ranking in Country Specific Search Engines - Tuesday, September 06, 2011

In the world of Search Engine Optimization, location is important. Search engines like to bring relevant results to a user, not only in the area of keywords and sites that give the user exactly what they are looking for, but also in the correct language. It doesn't do a lot of good for a Russian-speaking individual to continually get websites returned for a search query that are written in Arabic or Chinese. So a search engine has to have some way to return results in the right language, and a search engine's goal is also to get the user as close to home as possible in their search results.

Many people wonder why their websites don't rank well in some search engines, especially if they are trying to get ranked in a search engine based in another country. Perhaps they don't even know their website is in another country. You might say that is impossible: how could webmasters not know where their own site is? It might surprise them to find that their website is in fact hosted in a completely different country, perhaps even on another continent!

Consider that many search engines, including Google, will determine country not only based on the domain name (like .co.uk or .com.au), but also the country of a website's physical location based upon IP address. Search engines are programmed with information that tells them which IP addresses belong to which particular country, as well as which domain suffixes are assigned to which countries.

Let's say, for instance, that you are wishing to rank highly in Google based in the United States. It would not do well, then, for you to have your website hosted in Japan or Australia. You might have to switch your web host to one whose servers reside in the United States.

There is a tool we like to use called the Website to Country Tool. What it does is allow you to view which country your website is hosted in. Not only will this tell you what country your site is hosted in, but it can also help you determine a possible reason why your website may not be ranking as highly as you might like in a particular search engine.

It might be disheartening to learn that your website has been hosted in another country, but it is better to understand why your site might not be ranking as highly as you'd like it to be, especially when there is something you can definitely do about it.

Comments (0)
Optimization, Over-Optimization or SEO Overkill? - Tuesday, September 06, 2011

The fight for top positions in search engines' results knows no limits – neither ethical, nor technical. There are often reports of sites that have been temporarily or permanently excluded from Google and the other search engines because of malpractice and the use of “black hat” SEO techniques. The reaction of search engines is easy to understand – with so many tricks and cheats that SEO experts include in their arsenal, the relevancy of returned results would be seriously compromised to the point where search engines would start to deliver completely irrelevant and manipulated search results. And even if search engines do not discover your scams right away, your competitors might report you.

Keyword Density or Keyword Stuffing?

Sometimes SEO experts go too far in their desire to push their clients' sites to top positions and resort to questionable practices, like keyword stuffing. Keyword stuffing is considered an unethical practice because what you actually do is use the keyword in question throughout the text suspiciously often. Bearing in mind that the recommended keyword density is from 3 to 7%, anything above this (say 10% density) starts to look very much like keyword stuffing, and it is likely that it will not go unnoticed by search engines. A text with 10% keyword density can hardly make sense when read by a human. Some time ago Google implemented the so-called “Florida Update” and essentially imposed a penalty for pages that are keyword-stuffed and over-optimized in general.

Generally, keyword density in the title, the headings, and the first paragraphs matters more. Needless to say, you should be especially careful not to stuff these areas. Try the Keyword Density Cloud tool to check if your keyword density is within the acceptable limits, especially in the above-mentioned places. If you have a high density percentage for a frequently used keyword, then consider replacing some of the occurrences of the keyword with synonyms. Also, words that are in bold and/or italic are generally considered important by search engines, but if every occurrence of the target keywords is in bold or italic, this also looks unnatural and in the best case it will not push your page up.
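As a rough worked example (the numbers are only illustrative), a 400-word page that uses its target keyword 20 times has a keyword density of 20 / 400 = 5%, comfortably inside the recommended 3 to 7% range, while 40 occurrences on the same page would mean 10% density and would start to look like keyword stuffing.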

Doorway Pages and Hidden Text

Another common keyword scam is doorway pages. Before Google introduced the PageRank algorithm, doorways were a common practice and there were times when they were not considered an illegitimate optimization technique. A doorway page is a page that is made especially for the search engines, has no meaning for humans, and is used to get high positions in search engines and to trick users into coming to the site. Although keywords are still very important, today keywords alone have less effect in determining the position of a site in search results, so doorway pages no longer bring much traffic to a site, and if you use them, don't be surprised when Google penalizes you.

Very similar to doorway pages was a scam called hidden text. This is text that is invisible to humans (e.g. the text color is the same as the page background) but is included in the HTML source of the page, trying to fool search engines into believing that the particular page is keyword-rich. Needless to say, neither doorway pages nor hidden text can be qualified as optimization techniques; they are more manipulation than anything else.

Duplicate Content

It is a basic SEO rule that content is king. But not duplicate content. In terms of Google, duplicate content means text that is the same as the text on a different page on the SAME site (or on a sister site, or on a site that is heavily linked to the site in question and can be presumed to be related to it) – i.e. when you copy and paste the same paragraphs from one page on your site to another, you might expect to see your site's rank drop. Most SEO experts believe that syndicated content is not treated as duplicate content, and there are many examples of this. If syndicated content were duplicate content, the sites of news agencies would have been the first to drop out of search results. Still, it does not hurt to check from time to time whether your site shares duplicate content with another, if only because somebody might be illegally copying your content without your knowledge. The Similar Page Checker tool will help you see if you have grounds to worry about duplicate content.

Links Spam

Links are another major SEO tool and, like the other SEO tools, they can be used or misused. While backlinks are certainly important (for Yahoo the quantity of backlinks matters more, while for Google it is more important which sites the backlinks come from), getting tons of backlinks from a link farm or a blacklisted site is begging to be penalized. Also, if outbound links (links from your site to other sites) considerably outnumber your inbound links (links from other sites to your site), then you have put too much effort into creating useless links, because this will not improve your ranking. You can use the Domain Stats Tool to see the number of backlinks (inbound links) to your site and the Site Link Analyzer to see how many outbound links you have.

Using keywords in links (the anchor text), domain names, and folder and file names does boost your search engine rankings, but again, the right measure is the boundary between topping the search results and being kicked out of them. For instance, suppose you are optimizing for the keyword “cat”, a frequently chosen keyword for which, as with all popular keywords and phrases, competition is fierce. You might see no other way to reach the top than getting a domain name like http://www.cat-cats-kittens-kitty.com, which no doubt is packed with keywords to the maximum but is, first, difficult to remember, and second, if the contents do not correspond to the plenitude of cats in the domain name, you will never top the search results.

Although file and folder names are less important than domain names, now and then (but definitely not all the time) you can include “cat” (and synonyms) in them and in the anchor text of the links. This counts well, provided that the anchors are not artificially stuffed (for instance, if you use “cat_cats_kitten” as the anchor for internal site links, this anchor certainly is stuffed). While you have no control over the third parties that link to you and use anchors that you don't like, it is up to you to perform periodic checks of what anchor text other sites use to link to you. A handy tool for this task is the Backlink Anchor Text Analysis, where you enter the URL and get a listing of the sites that link to you and the anchor text they use.

Finally, to Google and the other search engines it makes no difference whether a site is intentionally over-optimized to cheat them or the over-optimization is the result of good intentions, so no matter what your motives are, always keep to reasonable practices and remember not to overstep the line.

Comments (0)
See Your Site With the Eyes of a Spider - Tuesday, September 06, 2011
Making efforts to optimize a site is great, but what counts is how search engines see your efforts. While even the most careful optimization does not guarantee top positions in search results, if your site does not follow basic search engine optimization truths, then it is more than certain that it will not score well with search engines. One way to check in advance how your SEO efforts are seen by search engines is to use a search engine simulator.

Spiders Explained

Basically, all search engine spiders function on the same principle – they crawl the Web and index pages, which are stored in a database and later processed by various algorithms that determine the ranking, relevancy, etc. of the collected pages. While the algorithms for calculating ranking and relevancy differ widely among search engines, the way they index sites is more or less uniform, and it is very important that you know what spiders are interested in and what they neglect.

Search engine spiders are robots and they do not read your pages the way a human does. Instead, they tend to see only particular stuff and are blind to many extras (Flash, JavaScript) that are intended for humans. Since spiders determine if humans will find your site, it is worth considering what spiders like and what they don't.

Flash, JavaScript, Image Text or Frames?!

Flash, JavaScript and image text are NOT visible to search engines. Frames are a real disaster in terms of SEO ranking. All of them might be great in terms of design and usability, but for search engines they are absolutely wrong. An incredible mistake one can make is to have a Flash intro page (frames or no frames, this will hardly make the situation worse) with the keywords buried in the animation. Run a page with Flash and images (and preferably no text or inbound or outbound hyperlinks) through the Search Engine Spider Simulator tool and you will see that to search engines this page appears almost blank.

Running your site through this simulator will show you more than the fact that Flash and JavaScript are not SEO favorites. In a way, spiders are like text browsers and they don't see anything that is not a piece of text. So having an image with text in it means nothing to a spider and it will ignore it. A workaround (recommended as an SEO best practice) is to include a meaningful description of the image in the ALT attribute of the <IMG> tag, but be careful not to use too many keywords in it because you risk penalties for keyword stuffing. The ALT attribute is especially essential when you use images rather than text for links. You can also use ALT text to describe what a Flash movie is about but, again, be careful not to cross the line between optimization and over-optimization.
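As a quick illustration (the file names and wording here are invented), a descriptive ALT attribute on a plain image and on an image that serves as a link might look like this:

<img src="kitten-feeding.jpg" alt="Bottle-feeding an orphaned kitten">

<a href="kitten-care.html"><img src="tips-button.gif" alt="Orphaned kitten care tips"></a>

A spider that cannot see the images can still read the ALT text, so keep it short and descriptive rather than a string of keywords.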

Are Your Hyperlinks Spiderable?

The search engine spider simulator can be of great help when trying to figure out if the hyperlinks lead to the right place. For instance, link exchange websites often put fake links to your site with JavaScript (using mouse over events and stuff to make the link look genuine) but actually this is not a link that search engines will see and follow. Since the spider simulator would not display such links, you'll know that something with the link is wrong.

It is highly recommended to use the <noscript> tag as a complement to JavaScript-based menus. The reason is that JavaScript-based menus are not spiderable, and all the links in them will simply be ignored rather than counted as page text. The solution to this problem is to duplicate all menu item links as plain links inside the <noscript> tag. The <noscript> tag can hold a lot, but please avoid using it for link stuffing or any other kind of SEO manipulation.
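Here is a minimal sketch of the idea, with invented page names; the plain links inside <noscript> remain visible to spiders even though the JavaScript-generated menu is not:

<script type="text/javascript">
// code that builds the fancy menu goes here; spiders will not follow links generated by it
</script>
<noscript>
<a href="index.html">Home</a>
<a href="articles.html">Articles</a>
<a href="contact.html">Contact</a>
</noscript>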

If you happen to have tons of hyperlinks on your pages (although it is highly recommended to have fewer than 100 hyperlinks on a page), then you might have a hard time checking if they are OK. For instance, if you have pages that return “403 Forbidden”, “404 Page Not Found” or similar errors that prevent the spider from accessing the page, then it is certain that those pages will not be indexed. It is necessary to mention that a spider simulator does not deal with 403 and 404 errors, because it checks where links lead to, not whether the target of the link is in place, so you need to use other tools to check whether the targets of your hyperlinks are the intended ones.

Looking for Your Keywords

While there are specific tools, like the Keyword Playground or the Website Keyword Suggestions, which deal with keywords in more detail, search engine spider simulators also help you see, with the eyes of a spider, where keywords are located in the text of the page. Why is this important? Because keywords in the first paragraphs of a page weigh more than keywords in the middle or at the end. And even if keywords visually appear to us to be at the top, this may not be the way spiders see them. Consider a standard Web page laid out with tables. In this case the code that describes the page layout (like navigation links or separate cells with text that is the same sitewide) might come first in the source and, what is worse, can be so long that the actual page-specific content ends up screens away from the top of the page. When we look at the page in a browser, everything seems fine – the page-specific content is on top. But since in the HTML code it is just the opposite, the page will not be noticed as keyword-rich.
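A simplified sketch of such a table-based layout shows the problem: in the HTML source the navigation cell comes first, so a spider reads all of it before it ever reaches the keyword-rich content, even though in the browser the content appears right at the top of the page.

<table>
<tr>
<td>
<!-- long sitewide navigation block: dozens of links, identical on every page -->
</td>
<td>
<!-- the page-specific, keyword-rich content only begins here, far down in the source -->
</td>
</tr>
</table>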

Are Dynamic Pages Too Dynamic to be Seen At All?

Dynamic pages (especially ones with question marks in the URL) are also an extra that spiders do not love, although many search engines do index dynamic pages as well. Running the spider simulator will give you an idea of how well your dynamic pages are accepted by search engines. Useful suggestions on how to deal with search engines and dynamic URLs can be found in the Dynamic URLs vs. Static URLs article.

Meta Keywords and Meta Description

Meta keywords and the meta description, as the names imply, are to be found in <META> tags in the <head> of an HTML page. Once, meta keywords and meta descriptions were among the most important criteria for determining the relevance of a page, but now search engines employ alternative mechanisms for determining relevancy, so you can safely skip listing keywords and a description in meta tags (unless you want to add instructions there for the spider about what to index and what not to; apart from that, meta tags are not very useful anymore).
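For reference, here is what these tags look like in the <head> of a page (the wording is only an example). The description and keywords carry little ranking weight nowadays, while the robots line is the part that still tells spiders what to index and which links to follow:

<meta name="description" content="Tips on rescuing and feeding orphaned kittens">
<meta name="keywords" content="orphaned kittens, kitten rescue, kitten feeding">
<meta name="robots" content="index, follow">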

Comments (0)
Optimizing for Yahoo! - Tuesday, September 06, 2011

Back in the dawn of the Internet, Yahoo! was the most popular search engine. When Google arrived, its indisputably precise search results made it the preferred search engine. However, Google is not the only search engine, and it is estimated that about 20-25% of searches are conducted on Yahoo!. Another major player on the market is MSN, which means that SEO professionals cannot afford to optimize only for Google but need to take into account the specifics of the other two engines (Yahoo! and MSN) as well.

Optimizing for three search engines at the same time is not an easy task. There were times when the SEO community was inclined to think that the algorithm of Yahoo! was deliberately just the opposite of the Google algorithm, because pages that ranked high in Google did not do so well in Yahoo! and vice versa. The attempt to optimize a site to appeal to both search engines usually led to being kicked out of the top of both of them.

Although there is no doubt that the algorithms of the two search engines are different, it is not possible to say for certain what exactly differs: both are constantly changing, neither is made publicly available by its authors, and the details about how each algorithm functions are obtained by speculation based on trial-and-error tests for particular keywords. What is more, having in mind the frequency with which algorithms are changed, it is not possible to react to every slight change, even if the algorithms' details were known officially. But knowing some basic differences between the two does help to get better rankings. A nice visual representation of the differences in positioning between Yahoo! and Google is given by the Yahoo vs Google tool.

The Yahoo! Algorithm - Differences With Google

Like all search engines, Yahoo! spiders the pages on the Web, indexes them in its database and later performs various mathematical operations to produce the pages with the search results. Yahoo! Slurp (the Yahoo! spiderbot) is the second most active spider crawler on the Web. Yahoo! Slurp is not different from the other bots, and if your page misses important elements of the SEO mix and is not spiderable, then it hardly matters which algorithm will be used because you will never get to a top position. (You may want to try the Search Engine Spider Simulator and check which of your pages are spiderable.)

Yahoo! Slurp might be even more active than Googlebot because occasionally there are more pages in the Yahoo! index than in Google. Another alleged difference between Yahoo! and Google is the sandbox (putting the sites “on hold” for some time till they appear in search results). Google's sandbox is deeper, so if you have made recent changes to your site, you might have to wait a month or two (shorter for Yahoo! and longer for Google) till these changes are reflected in the search results.

With new major changes in the Google algorithm under way (the so-called “BigDaddy” Infrastructure expected to be fully launched in March-April 2006) it's hard to tell if the same SEO tactics will be hot on Google in two months' time. One of the supposed changes is the decrease in weight of links. If this happens, a major difference between Yahoo! and Google will be eliminated because as of today Google places more importance on factors such as backlinks, while Yahoo! sticks more to onpage factors, like keyword density in the title, the URL, and the headings.

Of all the differences between Yahoo! and Google, the way keywords in the title and in the URL are treated is the most important. If you have the keyword in these two places, then you can expect a top 10 place in Yahoo!. But beware – a title and a URL cannot be unlimited, and technically you can place no more than 3 or 4 keywords there. Also, it matters whether the keyword in the title and in the URL is in its basic form or is a derivative – e.g. when searching for “cat”, URLs with “catwalk” will also be displayed in Yahoo!, but most likely in the second 100 results, while URLs with “cat” only are quite near the top.
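As an illustration (the title and URL are invented), a page optimized for “orphaned kitten care” would carry the keyword in both of the places Yahoo! weighs most heavily:

<title>Orphaned Kitten Care - Feeding and Rescue Tips</title>

together with a URL such as http://www.example.com/orphaned-kitten-care.html.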

Since Yahoo! is first a directory for submissions and then a search engine (with Google it's just the opposite), a site that has the keyword in the category it is listed under stands a better chance of appearing at the beginning of the search results. With Google this is not that important. For Yahoo! keywords in filenames also score well, while for Google this is not a factor of exceptional importance.

But the major difference is keyword density. The higher the density, the higher the positioning with Yahoo!. But beware – some of the keyword-rich sites that do well on Yahoo! can easily fall into the keyword-stuffed category for Google, so if you attempt to score well on Yahoo! (with keyword density above 7-8%), you risk being banned by Google!

Yahoo! WebRank

Following Google's example, Yahoo! introduced a Web toolbar that collects anonymous statistics about which sites users browse, thus getting an aggregated value (from 0 to 10) of how popular a given site is. The higher the value, the more popular a site is and the more valuable the backlinks from it are.

Although WebRank and positioning in the search results are not directly correlated, there is a dependency between them – sites with high WebRank tend to position higher than comparable sites with lower WebRank and the WebRanks of the top 20-30 results for a given keyword are most often above 5.00 on average.

The practical value of WebRank as a measure of success is often discussed in SEO communities, and the general opinion is that it is not the most relevant metric. However, one of the benefits of WebRank is that it alerts Yahoo! Slurp that a new page has appeared, thus inviting it to spider the page if it is not already in the Yahoo! Search index.

When the Yahoo! toolbar was launched in 2004, it had an icon that showed the WebRank of the page currently open in the browser. Later this feature was removed, but there are still tools on the Web that allow you to check the WebRank of a particular page. For instance, this tool allows you to check the WebRanks of a whole bunch of pages at a time.

Comments (0)
Jumping Over the Google Sandbox - Tuesday, September 06, 2011

It's never easy for newcomers to enter a market and there are barriers of different kinds. For newcomers to the world of search engines, the barrier is called a sandbox – your site stays there until it gets mature enough to be allowed into the Top Positions club. Although there is no direct confirmation of the existence of a sandbox, Google employees have implied it, and SEO experts have seen in practice that new sites, no matter how well optimized, don't rank high on Google, while on MSN and Yahoo they catch on quickly. For Google, the jailing in the sandbox for new sites with new domains lasts on average 6 months, although it can vary from less than a month to over 8 months.

Sandbox and Aging Delay

While it might be considered unfair to stop new sites by artificial means like keeping them at the bottom of search results, there is a fair amount of reasoning why search engines, and above all Google, have resorted to such measures. With blackhat practices like bulk buying of links, creation of duplicate content or simply keyword stuffing to get to the coveted top, it is no surprise that Google chose to penalize new sites, which overnight get tons of backlinks, or which are used as a source of backlinks to support an older site (possibly owned by the same company). Needless to say, when such fake sites are indexed and admitted to top positions, this deteriorates search results, so Google had to take measures for ensuring that such practices will not be tolerated. The sandbox effect works like a probation period for new sites and by making the practice of farming fake sites a long-term, rather than a short-term payoff for site owners, it is supposed to decrease its use.

Sandbox and aging delay are similar in meaning and many SEO experts use the terms interchangeably. Aging delay is more self-explanatory – sites are “delayed” till they come of age. Well, unlike in legislation, with search engines this age is not defined and it differs. There are cases when several sites were launched on the same day and indexed within a week of each other, but the aging delay for each of them expired in different months. As you see, the sandbox is something beyond your control and you cannot avoid it, but there are still steps you can take to minimize the damage for new sites with new domains.

Minimizing Sandbox Damages

While the Google sandbox is not something you can control, there are certain steps you can take in order to make the sandbox effect less destructive for your new site. As with many aspects of SEO, there are ethical and unethical tips and tricks, and unethical tricks can get you additional penalties or a complete ban from Google, so think twice before resorting to them. The unethical approaches will not be discussed in this article because they don't comply with our policy.

Before we delve into more detail about particular techniques to minimize sandbox damage, it is necessary to clarify the general rule: you cannot fight the sandbox. The only thing you can do is adapt to it and patiently wait for time to pass. Any attempts to fool Google – starting from writing melodramatic letters to Google, to using “sandbox tools” to bypass the filter – can only make your situation worse. There are many initiatives you can take while in the sandbox, for example:

  • Actively gather content and good links – as time passes by, relevant and fresh content and good links will take you to the top. When getting links, have in mind that they need to be from trusted sources – like DMOZ, CNN, Fortune 500 sites, or other reputable places. Also, links from .edu, .gov, and .mil domains might help because these domains are usually exempt from the sandbox filter. Don't get 500 links a month – this will kill your site! Instead, build links slowly and steadily.

  • Plan ahead – contrary to the general practice of launching a site when it is absolutely complete, launch a couple of pages when you have them. This will start the clock and time will be running parallel to your site development efforts.

  • Buy old or expired domains – the sandbox effect is more serious for new sites on new domains, so if you buy old or expired domains and launch your new site there, you'll experience fewer problems.

  • Host on a well-established host – another solution is to host your new site on a subdomain of a well-established host (however, free hosts are generally not a good idea in terms of SEO ranking). The sandbox effect is not so severe for new subdomains (unless the domain itself is blacklisted). You can also host the main site on a subdomain and host just some content on a separate domain, linked with the main site. You can also use redirects from the subdomained site to the new one, although the effect of this practice is questionable because it can also be viewed as an attempt to fool Google.

  • Concentrate on less popular keywords – the fact that your site is sandboxed does not mean that it is not indexed by Google at all. On the contrary, you could be able to top the search results from the very beginning! Looking like a contradiction with the rest of the article? Not at all! You could top the results for less popular keywords – sure, it is better than nothing. And while you wait to get to the top for the most lucrative keywords, you can discover that even less popular keywords are enough to keep the ball rolling, so you may want to make some optimization for them.

  • Rely more on non-Google ways to increase traffic – remember that Google is not the only search engine or marketing tool out there. So if you plan your SEO efforts to include other search engines, which either have no sandbox at all or where the period of stay is relatively short, this will also minimize the damage of the sandbox effect.

Comments (0)
What is Robots.txt - Tuesday, September 06, 2011

Robots.txt

It is great when search engines frequently visit your site and index your content, but often there are cases when indexing parts of your online content is not what you want. For instance, if you have two versions of a page (one for viewing in the browser and one for printing), you'd rather have the printing version excluded from crawling, otherwise you risk being hit with a duplicate content penalty. Also, if you happen to have sensitive data on your site that you do not want the world to see, you will also prefer that search engines do not index these pages (although in this case the only sure way of not indexing sensitive data is to keep it offline on a separate machine). Additionally, if you want to save some bandwidth by excluding images, stylesheets and JavaScript from indexing, you also need a way to tell spiders to keep away from these items.

One way to tell search engines which files and folders on your Web site to avoid is with the use of the Robots metatag. But since not all search engines read metatags, the Robots metatag can simply go unnoticed. A better way to inform search engines about your wishes is to use a robots.txt file.
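For completeness, the Robots metatag mentioned above is a single line placed in the <head> of the page you want to keep out of the index; “noindex” asks the engine not to list the page and “nofollow” asks it not to follow the links on it:

<meta name="robots" content="noindex, nofollow">

Robots.txt, described below, does the same job for whole files and folders from one central place.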

What Is Robots.txt?

Robots.txt is a text (not HTML) file you put on your site to tell search robots which pages you would like them not to visit. Robots.txt is by no means mandatory for search engines, but generally search engines obey what they are asked not to do. It is important to clarify that robots.txt is not a way of preventing search engines from crawling your site (i.e. it is not a firewall, or a kind of password protection); putting up a robots.txt file is something like putting a note “Please do not enter” on an unlocked door – you cannot prevent thieves from coming in, but the good guys will not open the door and enter. That is why we say that if you have really sensitive data, it is naïve to rely on robots.txt to protect it from being indexed and displayed in search results.

The location of robots.txt is very important. It must be in the main directory because otherwise user agents (search engines) will not be able to find it – they do not search the whole site for a file named robots.txt. Instead, they look first in the main directory (i.e. http://mydomain.com/robots.txt) and if they don't find it there, they simply assume that this site does not have a robots.txt file and therefore they index everything they find along the way. So, if you don't put robots.txt in the right place, do not be surprised that search engines index your whole site.

The concept and structure of robots.txt was developed more than a decade ago and if you are interested in learning more about it, visit http://www.robotstxt.org/ or go straight to the Standard for Robot Exclusion, because in this article we will deal only with the most important aspects of a robots.txt file. Next we will continue with the structure of a robots.txt file.

Structure of a Robots.txt File

The structure of a robots.txt file is pretty simple (and not very flexible) – it is simply a list of user agents and disallowed files and directories. Basically, the syntax is as follows:

User-agent:

Disallow:

“User-agent:” specifies which search engine crawlers the rules apply to, and “Disallow:” lists the files and directories to be excluded from indexing. In addition to “User-agent:” and “Disallow:” entries, you can include comment lines – just put the # sign at the beginning of the line:

# All user agents are disallowed to see the /temp directory.

User-agent: *

Disallow: /temp/

The Traps of a Robots.txt File

When you start making complicated files – i.e. you decide to allow different user agents access to different directories – problems can start, if you do not pay special attention to the traps of a robots.txt file. Common mistakes include typos and contradicting directives. Typos are misspelled user-agents, directories, missing colons after User-agent and Disallow, etc. Typos can be tricky to find but in some cases validation tools help.

The more serious problem is with logical errors. For instance:

User-agent: *

Disallow: /temp/

User-agent: Googlebot

Disallow: /images/

Disallow: /temp/

Disallow: /cgi-bin/

The above example is from a robots.txt that excludes all agents from the /temp directory and gives Googlebot its own, more restrictive record. The trap is that a crawler obeys only the record that matches it most specifically: when Googlebot reads this robots.txt, it will follow its own record and ignore the general one, because the rules of the two records are not combined. In this file that happens to be harmless, since /temp/ is repeated in the Googlebot record, but if you had forgotten to repeat it there, Googlebot would be free to crawl /temp/ even though the general record forbids it. You see, the structure of a robots.txt file is simple, but serious mistakes can still be made easily.
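To make the trap visible, here is a sketch of the same file with the /temp/ line accidentally left out of the Googlebot record; most other crawlers will stay out of /temp/, but Googlebot, obeying only its own record, will happily crawl it:

User-agent: *
Disallow: /temp/

User-agent: Googlebot
Disallow: /images/
Disallow: /cgi-bin/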

Tools to Generate and Validate a Robots.txt File

Having in mind the simple syntax of a robots.txt file, you can always read it to see if everything is OK but it is much easier to use a validator, like this one: http://tool.motoricerca.info/robots-checker.phtml. These tools report about common mistakes like missing slashes or colons, which if not detected compromise your efforts. For instance, if you have typed:

User agent: *

Disallow: /temp/

this is wrong because there is a space instead of a hyphen between “User” and “agent”, so the directive is not recognized and the syntax is incorrect.

In those cases when you have a complex robots.txt file – i.e. you give different instructions to different user agents, or you have a long list of directories and subdirectories to exclude – writing the file manually can be a real pain. But do not worry – there are tools that will generate the file for you. What is more, there are visual tools that allow you to point and select which files and folders are to be excluded. But even if you do not feel like buying a graphical tool for robots.txt generation, there are online tools to assist you. For instance, the Server-Side Robots Generator offers a dropdown list of user agents and a text box for you to list the files you don't want indexed. Honestly, it is not much help unless you want to set specific rules for different search engines, because in any case it is up to you to type the list of directories, but it is more than nothing.

Comments (0)
Bad Neighborhood - Tuesday, September 06, 2011

Has it ever happened to you that you have a perfectly optimized site, with lots of links and content and the right keyword density, and it still does not rank high in search engines? Probably every SEO has experienced this. The reasons for such a failure can be really diverse – starting from the sandbox effect (your site just needs time to get mature), to overoptimization and inappropriate online relations (i.e. the so-called “bad neighborhood” effect).

While there is not much you can do about the sandbox effect but wait, in most other cases it is up to you to counteract the negative effects you are suffering from. You just need to figure out what is stopping you from achieving the rankings you deserve. Careful analysis of your site and the sites that link to you can give you ideas about where to look for the source of trouble and how to deal with it. If it is overoptimization – remove the excessive stuffing; if it is bad neighbors – say “goodbye” to them. We have already dealt with overoptimization as SEO overkill, and in this article we will have a look at another frequent rankings killer.

Link Wisely, Avoid Bad Neighbors

It is a known fact that one of the most important items for high rankings, especially with Google, are links. The Web is woven out of links and inbound and outbound links are most natural. Generally, the more inbound links (i.e. other sites link to you) you have, the better. On the contrary, if you have many outbound links, this is not very good. And what is worse – it can be disastrous, if you link to improper places – i.e. bad neighbors. The concept is hardly difficult to comprehend – it is so similar to real life: if you choose outlaws or bad guys for friends, you are considered to be one of them.

It might look unfair to be penalized for things that you have not done but linking to sites with bad reputation is equal to a crime for search engines and by linking to such a site, you can expect to be penalized as well. And yes, it is fair because search engines do penalize sites that use different tricks to manipulate search results. In a way, in order to guarantee the integrity of search results, search engines cannot afford to tolerate unethical practices.

However, search engines tend to be fair and do not punish you for things that are out of your control. If you have many inbound links from suspicious sites, this will not be regarded as a malpractice on your side because generally it is their Web master, not you, who has put all these links. So, inbound links, no matter where they come from, cannot harm you. But if in addition to inbound links, you have a considerable amount of outbound links to such sites, in a sense you vote for them. Search engines consider this as malpractice and you will get punished.

Why Do Some Sites Get Labelled as Bad Neighbors?

We have already mentioned in this article some of the practices that are a reason for search engines to ban particular sites. But the “sins” are not limited to being a spam domain. Generally, companies get blacklisted because they try to boost their ranking by using illegal techniques such as keyword stuffing, duplicate content (or lack of any original content), hidden text and links, doorway pages, deceptive titles, machine-generated pages, copyright violations, etc. Search engines also tend to dislike meaningless link directories that convey the impression that they are topically arranged, so if you have a fat links section on your site, double-check what you link to.

Figuring Out Who's Good, Who's Not

Probably the question that pops up is: “But since the Web is so vast and so constantly changing, how can I know who is good and who is bad?” Well, you don't have to know each of the sites on the black list, even if that were possible. The black list itself is changing all the time, but it looks like there will always be companies and individuals who are eager to earn some cash by spamming, disseminating viruses and porn, or simply performing fraudulent activities.

The first check you need to perform when you have doubts that some of the sites you are linking to are bad neighbors is to see if they are included in the indices of Google and the other search engines. Type “site:siteX.com”, where “siteX.com” is the site you are checking, and see if Google returns any results from it. If it does not return any results, chances are that this site is banned from Google and you should immediately remove any outbound links to siteX.com.

If you have outbound links to many different sites, such checks might take a lot of time. Fortunately, there are tools that can help you in performing this task. The CEO of Blackwood Productions has recommended http://www.bad-neighborhood.com/ as one of the reliable tools that reports links to and from suspicious sites and sites that are missing in Google's index.

Comments (0)
Optimizing Flash Sites - Tuesday, September 06, 2011

If there is a really hot potato that divides SEO experts and Web designers, it is Flash. Undoubtedly a great technology for including sound and pictures on a Web site, Flash movies are a real nightmare for SEO experts. The reason is pretty prosaic – search engines cannot index (or at least not easily) the contents inside a Flash file, and unless you feed them with the text inside a Flash movie, you can simply count this text as lost for boosting your rankings. Of course, there are workarounds, but until search engines start indexing Flash movies as if they were plain text, these workarounds are just a clumsy way to optimize Flash sites, although they are certainly better than nothing.

Why Do Search Engines Dislike Flash Sites?

Search engines dislike Flash Web sites not because of their artistic qualities (or the lack of these) but because Flash movies are too complex for a spider to understand. Spiders cannot index a Flash movie directly, as they do with a plain page of text. Spiders index filenames (and you can find tons of these on the Web), but not the contents inside.

Flash movies come in a proprietary binary format (.swf) and spiders cannot read the insides of a Flash file, at least not without assistance. And even with assistance, do not count on spiders crawling and indexing all your Flash content. And this is true for all search engines. There might be differences in how search engines weigh page relevancy, but in their approach to Flash, at least for the time being, search engines are really united – they hate it but they index portions of it.

What (Not) to Use Flash For?

Despite the fact that Flash movies are not spider favorites, there are cases when a Flash movie is worth the SEO efforts. But as a general rule, keep Flash movies to a minimum. In this case less is definitely better, and search engines are not the only reason. First, Flash movies, especially banners and other kinds of advertisement, distract users and they generally tend to skip them. Second, Flash movies are fat. They consume a lot of bandwidth, and although the dialup days are over for the majority of users, a 1 Mbit connection or better is still not the standard.

Basically, designers should keep to the principle that Flash is good for enhancing a story, but not for telling it – i.e. you have some text with the main points of the story (and the keywords that you optimize for) and then you have the Flash movie to add further detail or just a visual representation of the story. In that connection, the greatest SEO sin is to have the whole site made in Flash! This is simply unforgivable, and you should not even dream of high rankings!

Another “No” is to use Flash for navigation. This applies not only to the starting page, where it was once fashionable to splash a gorgeous Flash movie, but to the links leading to other pages as well. Although it is a more common mistake to use images and/or JavaScript for navigation, Flash banners and movies must not be used to lead users from one page to another. Text links are the only SEO-approved way to build site navigation.

Workarounds for Optimizing Flash Sites

Although a workaround is not a solution, Flash sites still can be optimized. There are several approaches to this:

  • Input metadata
    This is a very important approach, although it is often underestimated and misunderstood. Although metadata is not as important to search engines as it used to be, Flash development tools make it easy to add metadata to your movies, so there is no excuse to leave the metadata fields empty.

  • Provide alternative pages
    For a good site it is a must to provide HTML-only pages that do not force the user to watch the Flash movie. Preparing these pages requires more work, but the reward is worth it, because not only users but search engines as well will see the HTML-only pages.

  • Flash Search Engine SDK
    This is the life-belt: the most advanced tool for extracting text from a Flash movie. One of the handiest applications in the Flash Search Engine SDK is the tool named swf2html. As its name implies, this tool extracts text and links from a Macromedia Flash file and writes the output to a standard HTML document, thus saving you the tedious job of doing it manually.
    However, you still need to have a look at the extracted contents and correct them, if necessary. For example, the order in which the text and links are arranged might need a little restructuring in order to put the keyword-rich content in the title and headings or in the beginning of the page.
    Also, you need to check that there is no duplicate content among the extracted sentences and paragraphs. The font color of the extracted text is another issue: if the font color of the extracted text is the same as the background color, you will run into hidden text territory.

  • SE-Flash.com
    Here is a tool that visually shows which parts of your Flash files are visible to search engines and which are not. This tool is very useful, even if you already have the Flash Search Engine SDK installed, because it provides one more check of the accuracy of the extracted text. Besides, it is not certain that Google and the other search engines use the Flash Search Engine SDK to get contents from a Flash file, so this tool might give completely different results from those that the SDK produces.

These approaches are just some of the most important examples of how to optimize Flash sites. There are many other approaches as well. However, not all of them are brilliant and clear, and some sit on the boundary of ethical SEO – e.g. creating invisible layers of text that are delivered to spiders instead of the Flash movie itself. Although this technique is not exactly wrong – i.e. there is no duplicate or fake content – it is very similar to cloaking and doorway pages, and it is better to avoid it.

Comments (0)