Thursday, May 16, 2013

located at http://www.example.com/sitemap.xml, it can't include URLs from http://subdomain.example.com.
URLs that are not considered valid are dropped from further consideration. It is strongly recommended that you place your Sitemap at the root directory of your web server. For example, if your web server is at example.com, then your Sitemap index file would be at http://example.com/sitemap.xml. In certain cases, you may need to produce different Sitemaps for different paths (e.g., if security permissions in your organization compartmentalize write access to different directories).
If you submit a Sitemap using a path with a port number, you must include that port number as part of the path in each URL listed in the Sitemap file. For instance, if your Sitemap is located at http://www.example.com:100/sitemap.xml, then each URL listed in the Sitemap must begin with http://www.example.com:100.
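The location rules above can be sketched as a small check. This is a minimal illustration, not part of the protocol text: the function name url_allowed_in_sitemap is hypothetical, and the directory-prefix test assumes that a Sitemap only covers URLs at or below its own directory.

```python
import posixpath
from urllib.parse import urlsplit


def url_allowed_in_sitemap(sitemap_url, page_url):
    """Return True if page_url may be listed in the Sitemap at sitemap_url.

    Hypothetical helper: compares scheme, host and port, then checks that
    the page lives at or below the directory holding the Sitemap.
    """
    sitemap = urlsplit(sitemap_url)
    page = urlsplit(page_url)

    # netloc includes the port, so http://www.example.com:100 and
    # http://www.example.com are treated as different locations.
    if (sitemap.scheme, sitemap.netloc.lower()) != (page.scheme, page.netloc.lower()):
        return False

    # Assumption: URLs must sit under the Sitemap's own directory.
    base_dir = posixpath.dirname(sitemap.path)
    if not base_dir.endswith("/"):
        base_dir += "/"
    return (page.path or "/").startswith(base_dir)
```

With a Sitemap at the root directory, every URL on the same scheme, host and port passes the check, which is one reason the root location is recommended.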

Sitemaps & Cross Submits

To submit Sitemaps for multiple hosts from a single host, you need to "prove" ownership of the host(s) for which URLs are being submitted in a Sitemap. Here's an example. Let's say that you want to submit Sitemaps for 3 hosts:
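www.host1.com with Sitemap file sitemap-host1.xml
www.host2.com with Sitemap file sitemap-host2.xml
www.host3.com with Sitemap file sitemap-host3.xml

Moreover, you want to place all three Sitemaps on a single host: www.sitemaphost.com. So the Sitemap URLs will be:
http://www.sitemaphost.com/sitemap-host1.xml
http://www.sitemaphost.com/sitemap-host2.xml
http://www.sitemaphost.com/sitemap-host3.xml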
By default, this will result in a "cross submission" error since you are trying to submit URLs for www.host1.com through a Sitemap that is hosted on www.sitemaphost.com (and same for the other two hosts). One way to avoid the error is to prove that you own (i.e. have the authority to modify files) www.host1.com. You can do this by modifying the robots.txt file on www.host1.com to point to the Sitemap on www.sitemaphost.com.
In this example, the robots.txt file at http://www.host1.com/robots.txt would contain the line "Sitemap: http://www.sitemaphost.com/sitemap-host1.xml". By modifying the robots.txt file on www.host1.com and having it point to the Sitemap on www.sitemaphost.com, you have implicitly proven that you own www.host1.com. In other words, whoever controls the robots.txt file on www.host1.com trusts the Sitemap at http://www.sitemaphost.com/sitemap-host1.xml to contain URLs for www.host1.com. The same process can be repeated for the other two hosts.
Now you can submit the Sitemaps on www.sitemaphost.com.
When a particular host's robots.txt, say http://www.host1.com/robots.txt, points to a Sitemap or a Sitemap index on another host, it is expected that for each of the target Sitemaps, such as http://www.sitemaphost.com/sitemap-host1.xml, all the URLs belong to the host pointing to it. This is because, as noted earlier, a Sitemap is expected to have URLs from a single host only.
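The two conditions described in this section can be sketched as a check: the host's robots.txt must point to the Sitemap, and every URL in that Sitemap must belong to the host. The helper names sitemap_directives and cross_submit_ok are hypothetical, and the robots.txt content is passed in as text rather than fetched, so this is only an illustration of the logic and not how any particular search engine implements it.

```python
from urllib.parse import urlsplit


def sitemap_directives(robots_txt):
    """Return the URLs named by 'Sitemap:' lines in robots.txt content."""
    urls = []
    for line in robots_txt.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip().lower() == "sitemap":
            urls.append(value.strip())
    return urls


def cross_submit_ok(host, robots_txt, sitemap_url, sitemap_entries):
    """Return True if the cross submission looks acceptable.

    host            -- the host whose URLs are being submitted
    robots_txt      -- content of that host's robots.txt
    sitemap_url     -- location of the Sitemap on the other host
    sitemap_entries -- URLs listed inside that Sitemap
    """
    # Condition 1: the host's robots.txt points to the Sitemap.
    if sitemap_url not in sitemap_directives(robots_txt):
        return False
    # Condition 2: the Sitemap contains URLs from that single host only.
    return all(urlsplit(u).netloc.lower() == host.lower() for u in sitemap_entries)
```

For the example in this section, host would be www.host1.com and sitemap_url would be http://www.sitemaphost.com/sitemap-host1.xml.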

Validating your Sitemap

The following XML schemas define the elements and attributes that can appear in your Sitemap file. You can download these schemas from the links below:
For Sitemaps: http://www.sitemaps.org/schemas/sitemap/0.9/sitemap.xsd
For Sitemap index files: http://www.sitemaps.org/schemas/sitemap/0.9/siteindex.xsd
There are a number of tools available to help you validate the structure of your Sitemap based on this schema. You can find a list of XML-related tools at each of the following locations:

In order to validate your Sitemap or Sitemap index file against a schema, the XML file will need additional headers as shown below.
Sitemap:
<?xml version='1.0' encoding='UTF-8'?>
<urlset xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://www.sitemaps.org/schemas/sitemap/0.9 http://www.sitemaps.org/schemas/sitemap/0.9/sitemap.xsd"
         xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
   <!-- <url> entries go here -->
</urlset>

Sitemap index file:
<?xml version='1.0' encoding='UTF-8'?>
<sitemapindex xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://www.sitemaps.org/schemas/sitemap/0.9 http://www.sitemaps.org/schemas/sitemap/0.9/siteindex.xsd"
         xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
   <!-- <sitemap> entries go here -->
</sitemapindex>

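Before running a full schema validator, a quick pre-check can catch the most common mistakes. The sketch below uses only Python's standard library, which does not validate against XSD; it merely confirms that the file is well-formed XML and that the root element is urlset or sitemapindex in the Sitemaps namespace. The function name sitemap_kind is hypothetical.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def sitemap_kind(xml_bytes):
    """Return 'urlset' or 'sitemapindex' for a Sitemap document.

    Raises ET.ParseError if the document is not well-formed and
    ValueError if the root element is not in the Sitemaps namespace.
    This is a pre-check only, not schema validation.
    """
    root = ET.fromstring(xml_bytes)
    for kind in ("urlset", "sitemapindex"):
        if root.tag == "{%s}%s" % (SITEMAP_NS, kind):
            return kind
    raise ValueError("unexpected root element: %s" % root.tag)
```

A document that passes this check can still fail schema validation, so it complements the validation tools rather than replacing them.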
Extending the Sitemaps protocol

You can extend the Sitemaps protocol using your own namespace. Simply specify this namespace in the root element. For example:
<?xml version='1.0' encoding='UTF-8'?>
<urlset xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://www.sitemaps.org/schemas/sitemap/0.9 http://www.sitemaps.org/schemas/sitemap/0.9/sitemap.xsd"
         xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
         xmlns:example="http://www.example.com/schemas/example_schema"> <!-- namespace extension -->
   <!-- <url> entries, which may now contain elements in the example: namespace, go here -->
</urlset>

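A Sitemap that uses an extension namespace can also be generated programmatically. The sketch below uses Python's standard library; the function name build_extended_sitemap and the element name example_tag are hypothetical and stand in for whatever your own schema defines.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
# Namespace URI taken from the example above; the elements inside it
# are defined by your own schema.
EXAMPLE_NS = "http://www.example.com/schemas/example_schema"


def build_extended_sitemap(entries):
    """Build a Sitemap string from (loc, extension_text) pairs.

    Each <url> gets the standard <loc> element plus one hypothetical
    <example:example_tag> element from the extension namespace.
    """
    # Serialize the Sitemaps namespace as the default namespace and
    # the extension namespace under the 'example' prefix.
    ET.register_namespace("", SITEMAP_NS)
    ET.register_namespace("example", EXAMPLE_NS)

    urlset = ET.Element("{%s}urlset" % SITEMAP_NS)
    for loc, extension_text in entries:
        url = ET.SubElement(urlset, "{%s}url" % SITEMAP_NS)
        ET.SubElement(url, "{%s}loc" % SITEMAP_NS).text = loc
        ET.SubElement(url, "{%s}example_tag" % EXAMPLE_NS).text = extension_text
    return ET.tostring(urlset, encoding="unicode")
```

Both namespace declarations end up on the root element, which matches the approach described in this section.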
Informing search engine crawlers

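Once you have created the Sitemap file and placed it on your webserver, you need to inform the search engines that support this protocol of its location. You can do this by:
submitting it to them via the search engine's submission interface
specifying the location in your site's robots.txt file
sending an HTTP request
The search engines can then retrieve your Sitemap and make the URLs available to their crawlers.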

Submitting your Sitemap via the search engine's submission interface

To submit your Sitemap directly to a search engine, which will enable you to receive status information and any processing errors, refer to each search engine's documentation.

Specifying the Sitemap location in your robots.txt file

You can specify the location of the Sitemap using a robots.txt file. To do this, simply add the following line including the full URL to the sitemap:
Sitemap: http://www.example.com/sitemap.xml
This directive is independent of the user-agent line, so it doesn't matter where you place it in your file. If you have a Sitemap index file, you can include the location of just that file. You don't need to list each individual Sitemap listed in the index file.
You can specify more than one Sitemap file per robots.txt file:
Sitemap: http://www.example.com/sitemap-host1.xml
Sitemap: http://www.example.com/sitemap-host2.xml
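To confirm that a robots.txt file exposes the Sitemap locations you expect, you can read them back with Python's standard urllib.robotparser module. This sketch assumes Python 3.8 or later, where RobotFileParser.site_maps() is available; the wrapper name sitemaps_from_robots is hypothetical.

```python
from urllib.robotparser import RobotFileParser


def sitemaps_from_robots(robots_txt):
    """Return the Sitemap URLs declared in robots.txt content."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    # site_maps() returns None when no Sitemap lines are present.
    return parser.site_maps() or []


# The Sitemap lines are picked up regardless of where they sit
# relative to the user-agent records.
robots_txt = """\
Sitemap: http://www.example.com/sitemap-host1.xml
User-agent: *
Disallow: /private/
Sitemap: http://www.example.com/sitemap-host2.xml
"""
found = sitemaps_from_robots(robots_txt)
```

Here found holds both Sitemap URLs in the order they appear in the file.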

Submitting your Sitemap via an HTTP request

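To submit your Sitemap using an HTTP request (replace <searchengine_URL> with the URL provided by the search engine), issue your request to the following URL:
<searchengine_URL>/ping?sitemap=sitemap_url
For example, if your Sitemap is located at http://www.example.com/sitemap.gz, your URL will become:
<searchengine_URL>/ping?sitemap=http://www.example.com/sitemap.gz
URL encode everything after the /ping?sitemap=:
<searchengine_URL>/ping?sitemap=http%3A%2F%2Fwww.example.com%2Fsitemap.gz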
You can issue the HTTP request using wget, curl, or another mechanism of your choosing. A successful request will return an HTTP 200 response code; if you receive a different response, you should resubmit your request. The HTTP 200 response code only indicates that the search engine has received your Sitemap, not that the Sitemap itself or the URLs contained in it were valid. An easy way to keep search engines up to date is to set up an automated job that generates and submits Sitemaps on a regular basis.
Note: If you are providing a Sitemap index file, you only need to issue one HTTP request that includes the location of the Sitemap index file; you do not need to issue individual requests for each Sitemap listed in the index.
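The request described in this section can be sketched in Python as follows. The function names build_ping_url and submit_sitemap are hypothetical, and the search engine URL is whatever base address the search engine documents; nothing here is specific to a particular engine.

```python
import urllib.error
import urllib.request
from urllib.parse import quote


def build_ping_url(search_engine_url, sitemap_url):
    """Return the ping URL with the Sitemap location URL-encoded."""
    # safe="" makes quote() encode ':' and '/' as well, so everything
    # after /ping?sitemap= is fully URL-encoded.
    encoded = quote(sitemap_url, safe="")
    return "%s/ping?sitemap=%s" % (search_engine_url.rstrip("/"), encoded)


def submit_sitemap(search_engine_url, sitemap_url, timeout=10):
    """Issue the ping request; return True only on an HTTP 200 response.

    A True result means the request was received, not that the Sitemap
    or its URLs were valid.
    """
    url = build_ping_url(search_engine_url, sitemap_url)
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except urllib.error.URLError:
        # Covers network failures and non-2xx responses (HTTPError).
        return False
```

A scheduled job could call submit_sitemap after each regeneration and retry when it returns False.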

Excluding content

The Sitemaps protocol enables you to let search engines know what content you would like indexed. To tell search engines the content you don't want indexed, use a robots.txt file or robots meta tag. See robotstxt.org for more information on how to exclude content from search engines.
