Sitemap for web crawlers

What is a Sitemap?

A site map (or sitemap) is a list of pages of a web site accessible to crawlers or users. It can be either a document in any form used as a planning tool for Web design, or a Web page that lists the pages on a Web site, typically organized in hierarchical fashion.

“Why would you need a sitemap for your web site?” is a reasonable question for any web designer. A sitemap is a quick access point that allows the visitors on your web site to see the hierarchy of your web site and its pages at a single glance. Most web design tool kits provide a way to generate a sitemap from the current set of available pages, which matters because product, information, and static pages are constantly being added, edited, and deleted.

Let’s try to understand what a sitemap is

A sitemap is a file where you can list the web pages of your site to tell Google and other search engines about the organization of your site content. Search engine web crawlers like Googlebot read this file to more intelligently crawl your site.

Also, your sitemap can provide valuable metadata associated with the pages you list in it: metadata is information about a webpage, such as when the page was last updated, how often the page is changed, and the importance of the page relative to other URLs in the site. (Source: Google support)
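To make this concrete, here is a minimal sketch, assuming Python 3, of generating such a file with nothing but the standard library; every URL, date, and value in it is a hypothetical placeholder for your own pages.

    # A minimal sketch of generating a sitemap.xml with the metadata
    # described above. All URLs, dates, and values are hypothetical
    # placeholders for your own pages.
    import xml.etree.ElementTree as ET

    SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"

    pages = [
        # (location, last updated, change frequency, relative priority)
        ("http://www.example.com/", "2015-02-01", "daily", "1.0"),
        ("http://www.example.com/products.html", "2015-01-15", "weekly", "0.8"),
        ("http://www.example.com/about.html", "2014-11-30", "yearly", "0.3"),
    ]

    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for loc, lastmod, changefreq, priority in pages:
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = loc                # the page's address
        ET.SubElement(url, "lastmod").text = lastmod        # when the page was last updated
        ET.SubElement(url, "changefreq").text = changefreq  # how often the page is changed
        ET.SubElement(url, "priority").text = priority      # importance relative to other URLs

    # Write the file that web crawlers will read.
    ET.ElementTree(urlset).write("sitemap.xml", encoding="utf-8", xml_declaration=True)

Each url entry carries the lastmod, changefreq, and priority metadata described above, and the finished sitemap.xml is typically placed at the root of the web site.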

In short, this additional information helps make your web site a possible hit when folks are searching the internet for particular services or products.

Why do you need a sitemap?

If your site’s pages are properly linked, Google’s web crawlers can usually discover most of your site. Even so, a sitemap can improve the crawling of your site, particularly if your site meets one of the following criteria:

  • Your site is really large. As a result, it’s more likely that Google’s web crawlers might overlook some of your new or recently updated pages.
  • Your site has a large archive of content pages that are isolated or not well linked to each other. If your site’s pages do not naturally reference each other, you can list them in a sitemap to ensure that Google does not overlook some of them.
  • Your site is new and has few external links to it. Googlebot and other web crawlers crawl the web by following links from one page to another. As a result, Google might not discover your pages if no other sites link to them.
  • Your site uses rich media content, is shown in Google News, or uses other sitemaps-compatible annotations. Google can take additional information from sitemaps into account for search, where appropriate.

Here are a few useful tips for sitemaps

Updating your sitemap after every revision is an excellent practice.  An HTML sitemap can be thought of as a layout of all the pages your web site has to offer, while an XML sitemap is aimed specifically at search engines, since it reflects the most up-to-date information about recently updated pages.  Depending on the web site design, you can have plug-ins or tools automatically create and update the sitemap.

Once you have your sitemap ready, you should submit it to search engines like Google, Bing, and Yahoo via their respective webmaster tools, such as the Sitemaps section of Google Webmaster Tools.
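Alongside the webmaster tools, Google and Bing have also offered a simple HTTP “ping” endpoint that asks the crawler to re-fetch your sitemap after it changes. Below is a minimal Python sketch of that notification; the sitemap URL is a hypothetical placeholder.

    # A minimal sketch of pinging the search engines after the sitemap
    # changes, assuming the HTTP "ping" endpoints Google and Bing have
    # offered alongside their webmaster tools. The sitemap URL is a
    # hypothetical placeholder.
    import urllib.parse
    import urllib.request

    SITEMAP_URL = "http://www.example.com/sitemap.xml"

    for endpoint in ("http://www.google.com/ping", "http://www.bing.com/ping"):
        ping = endpoint + "?sitemap=" + urllib.parse.quote(SITEMAP_URL, safe="")
        with urllib.request.urlopen(ping) as response:
            # A 200 status means the engine accepted the notification.
            print(endpoint, "->", response.status)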

Mozilla Firefox v35.0 still has HUGE memory issues on Windows machines, which creates a resource drain.

The recommendation is for users not to use the latest version of Firefox until the memory issues are resolved; usage can at times exceed 2 gigabytes and continue to grow during a session, regardless of how many tabbed windows are open. In the lab, the decision was made to simply start Firefox with the latest updates applied (35.0.1 at this stage) and open a single browser session to any favorite web site.

To conduct a similar test, all you need to do is follow these steps (a small monitoring script is sketched after the list):

  • Start a Firefox session and open a single browser window to any web site.
  • On a Windows 7 machine, click on Start and enter taskmgr in the search text box.  This will start up a Task Manager session.  Task Manager gives the end user a simple way to review what is happening with memory, processes, and services for a Windows session.  Go ahead and click on the Processes tab.  Finally, click on the Memory column to sort by the highest usage.
  • On a Windows 8 machine, right-click on the Taskbar and select Task Manager from the popup menu.  Then follow the same steps described above.
  • Watch as Firefox continues to eat memory while doing absolutely nothing; you do not even have to move around the web site you chose.
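For a more precise record than eyeballing Task Manager, the same measurement can be scripted. Here is a minimal sketch, assuming Python 3 and the third-party psutil package (pip install psutil); the process name firefox.exe applies on Windows.

    # A minimal sketch of the Task Manager measurement as a script,
    # assuming the third-party psutil package is installed
    # (pip install psutil). "firefox.exe" is the process name on Windows.
    import time

    import psutil

    def firefox_memory_mb():
        """Sum the resident memory (RSS) of every firefox.exe process, in megabytes."""
        total = 0
        for proc in psutil.process_iter(["name", "memory_info"]):
            name = proc.info["name"]
            mem = proc.info["memory_info"]
            if name and name.lower() == "firefox.exe" and mem is not None:
                total += mem.rss
        return total / (1024 * 1024)

    if __name__ == "__main__":
        # Print a reading every 30 seconds while the browser sits idle.
        while True:
            print("firefox.exe memory: %.1f MB" % firefox_memory_mb())
            time.sleep(30)

Leaving this running while the browser sits idle on a single page is enough to watch the reported figure climb.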

When Mozilla was confronted about the problem, they listed between 10 and 12 options for reviewing what is happening in the browser.  Naturally, the first suggestion is always to get the latest version, next is to update any plug-ins, and finally, believe it or not, to add memory.  This is all nonsense; there are serious issues with Firefox that are not being treated as a high priority, and this will come back to haunt them if it is not taken care of in the near future.

In the meantime, Baron Software recommends using Google Chrome, which is by far the least resource-hungry browser on any Windows machine.  Internet Explorer (IE) is sometimes very buggy, and now Firefox, which was once a favorite, is being pushed to the bottom since the resource drain hampers any type of work session.