Here are some questions we have received in the past few weeks, along with our answers. If you have any questions, you can send them to customerservice@elixirsystems.com. Dylan

Q - I have a new website. I was checking daily to see whether it was indexed by Yahoo and Google. At one time Yahoo had 76 pages cached from my site. Now it is just one. Can you suggest why this would happen?

A - (When we checked the two site URLs supplied, the sites were identical.) The search engines do not like duplicate content, as it tends to clog up the search results, and no one wants to see the same result multiple times. This is why one copy has been suppressed by the search engines. I would suggest picking one site and sticking to that name (preferably the site that is currently fully indexed). On the second site, set up a 301 redirect. The method of implementing a 301 redirect differs depending on which web server you are hosted on. In this instance it is Apache, and you can use a .htaccess file to send visitors to the main site. A .htaccess file is a plain text file: create a blank file and name it .htaccess (a dot followed by htaccess). Your .htaccess file would look something like:

Options +FollowSymLinks

This should be put in the topmost (root) directory of your old site (usually the same place as your home page). By the way, if the two sites are parked at the same DNS, which they appear to be, then the code above only redirects the old site's traffic; the new site's traffic will not be affected.

Q - My boss has the idea that if we buy a bunch of (5-10) franchise domains, build sites with 1 page of franchise content on each of them, and link these sites to the American site, it will help ranking. I am not so sure. What do you think it could do to help, in the short and long term?

A - This idea used to be a way of building up the “mother” site. You could create satellite sites and have them all point to the original site, and that would help the original site. However, it is frowned upon by Google. These sites are not built for the visitor, but with the purpose of manipulating the rankings.
The aging filter, or sandbox, that Google currently uses is partly a response to this technique. For this to be really effective, the satellite sites need to have substantial original content, have PageRank (or incoming links) of their own, have outbound links to other sites and have been in existence for at least a year. If the only significant outbound links are to the original site, they are easy to spot and the links are ignored. If they have no incoming links, their outbound links are devalued and they don’t contribute to the original site anyway. If they are with the same hosting company or have the same WHOIS information, the original site can be penalized for the use of this technique. If the content is substantially copied from the original site, then all sites are penalized for duplicate content.

You can do this, but know that it carries high risk and a high investment, with little reward in the short term. Because of the better detection and filters employed by Google, we don’t recommend this. We think the risks will continue to mount and that your time, effort and money are best spent on your original site. The Google webmaster guidelines can be found at http://www.google.com/webmasters/guidelines.html . Google tends to have the stricter and better enforced rules. However, the other search engines also know that the only real value they have is the quality of the results returned to searchers. They are quickly closing the gap and are also getting better at detecting and penalizing sites. That’s part of what I mean by increasing risks.

Q - In your SEO course you mention removing the robots.txt file in case it is blocking the spiders from accessing your website. Can you provide more details?

A - Robots.txt is a minefield; it can catch out even the most experienced people. A robots.txt file in and of itself is not an issue, but if it's set up incorrectly it can cause problems. One common issue is leaving the trailing '/' off blocked directories.
The search engines by default treat the Disallow line as a URL stem and append a '*' to the end. So, for example, using the entry:

Disallow: /dev

when meaning to block the /dev directory will also block access to /develop.htm, /deviled-eggs.php, etc. The correct wording should be:

Disallow: /dev/
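
As a rough illustration of the prefix matching described above, the following Python sketch uses the standard library's urllib.robotparser module to compare a rule with and without the trailing slash. The "User-agent: *" line, the /dev/notes.htm path and the helper name allowed are assumptions made for the example, and individual search engine spiders may interpret rules differently from this parser.

```python
from urllib.robotparser import RobotFileParser


def allowed(robots_txt: str, path: str) -> bool:
    """Return True if a generic spider may fetch `path` under these rules."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch("*", path)


# Hypothetical robots.txt contents; the User-agent line is an assumption.
WITHOUT_SLASH = "User-agent: *\nDisallow: /dev\n"
WITH_SLASH = "User-agent: *\nDisallow: /dev/\n"

# /dev/notes.htm is an illustrative file inside the blocked directory.
for path in ("/dev/notes.htm", "/develop.htm", "/deviled-eggs.php"):
    print(path,
          "| without slash:", allowed(WITHOUT_SLASH, path),
          "| with slash:", allowed(WITH_SLASH, path))
```

Running a check like this against your own rules before uploading robots.txt is one way to catch a missing slash early.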

Having no robots.txt file at all can also block the search engine spiders. Before crawling your site, well-behaved spiders always request the robots.txt file. If this file is missing, your website may return a 404 (page not found), a 404 error page, or in some cases an error 500 (internal server error). If the spiders receive an error code 500, they may assume your site is broken and not index it.
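
To see which of these responses your own server gives, a small check along the following lines may help. It is only a sketch: the function names robots_status and describe are made up for this example, the site address in the comment is a placeholder, and the messages simply restate the points above and are not any search engine's documented behavior.

```python
import urllib.error
import urllib.request


def robots_status(site: str, timeout: float = 10.0) -> int:
    """Request <site>/robots.txt and return the HTTP status code.

    Connection failures (urllib.error.URLError) are not handled here
    and will propagate to the caller.
    """
    url = site.rstrip("/") + "/robots.txt"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # 4xx and 5xx responses arrive as exceptions; keep just the code.
        return err.code


def describe(status: int) -> str:
    """Summarize a status code in the terms used in the answer above."""
    if status == 200:
        return "robots.txt was served"
    if status == 404:
        return "robots.txt is missing (404 page not found)"
    if 500 <= status < 600:
        return "server error: spiders may assume the site is broken"
    return f"unexpected status {status}"


# Example (needs network access; replace the placeholder address):
# print(describe(robots_status("http://www.example.com")))
```

If the check reports a server error, uploading even an empty robots.txt file is one way to make sure the request succeeds.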

This is the newsletter from Elixir Systems. Elixir Systems is a Phoenix, Arizona based search engine optimization and search engine marketing services company. If you received a copy of this newsletter and would like to subscribe, go to http://www.elixirsystems.com/newsletter.

© Elixir Systems, All rights reserved. You are free to use this article in whole or in part, as long as you include complete attribution, including live web site link and email link. Please also notify us where the material will appear using the address