⚡️ What sitemap.xml does and does not do
Overview of sitemap.xml
A sitemap.xml is an XML file that lists the URLs of a website (individual page addresses, not just the domain). It tells search engines such as Google and Bing how the website is structured and which pages are available. It complements robots.txt and helps search engine crawlers efficiently find and index all the important pages of the website, especially those that might not be discovered by following links.
Important features of sitemap.xml
- XML format: The sitemap is an XML file that uses a fixed set of tags to describe the URLs and additional information. XML and HTML are related markup languages, but a sitemap must follow the sitemap schema and is not an HTML page.
- Location and accessibility: The file is usually stored in the root directory of the website, e.g. www.yourwebsite.com/sitemap.xml, and should be referenced in the robots.txt file.
- Additional information: In addition to the URLs, the sitemap can contain additional information such as the last modification date, the frequency of changes and the priority of the pages.
- Automation: Sitemaps can be generated automatically. Many content management systems already offer this feature.
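As a rough illustration of automatic generation, the following Python sketch builds a sitemap from a list of pages using only the standard library. The function name `build_sitemap` and the structure of the `pages` list are assumptions made for this example; a CMS or plugin will typically do this differently.

```python
import xml.etree.ElementTree as ET

# Namespace required by the sitemap protocol for the <urlset> root element.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(pages):
    """Return sitemap XML as a string for a list of page dicts.

    Each dict needs a "loc" key; "lastmod", "changefreq" and "priority"
    are optional. (Hypothetical helper, for illustration only.)
    """
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for page in pages:
        url = ET.SubElement(urlset, "url")
        # Emit the child tags in the conventional order, skipping missing ones.
        for tag in ("loc", "lastmod", "changefreq", "priority"):
            if tag in page:
                ET.SubElement(url, tag).text = str(page[tag])
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


# Example input: in practice this list would come from a database or a crawl.
pages = [
    {"loc": "http://www.yourwebsite.com/", "lastmod": "2024-05-30",
     "changefreq": "daily", "priority": "1.0"},
    {"loc": "http://www.yourwebsite.com/contact.html", "lastmod": "2024-05-20"},
]
sitemap_xml = build_sitemap(pages)
```

Writing `sitemap_xml` to a file named sitemap.xml in the root directory of the website would then make it available to crawlers.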
Example of a sitemap.xml
Here is a simple example of a sitemap.xml:
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
<url>
<loc>http://www.yourwebsite.com/</loc>
<lastmod>2024-05-30</lastmod>
<changefreq>daily</changefreq>
<priority>1.0</priority>
</url>
<url>
<loc>http://www.yourwebsite.com/page1.html</loc>
<lastmod>2024-05-25</lastmod>
<changefreq>weekly</changefreq>
<priority>0.8</priority>
</url>
<url>
<loc>http://www.yourwebsite.com/contact.html</loc>
<lastmod>2024-05-20</lastmod>
<changefreq>monthly</changefreq>
<priority>0.5</priority>
</url>
</urlset>
Elements of the sitemap.xml
- <urlset>: The root element that encloses all URLs in the sitemap.
- <url>: A container for each URL on the website.
- <loc>: The URL of the page.
- <lastmod>: (Optional) The date the page was last modified, in YYYY-MM-DD format.
- <changefreq>: (Optional) The estimated frequency with which the page's content changes (e.g. always, hourly, daily, weekly, monthly, yearly, never).
- <priority>: (Optional) The priority of the page relative to other pages on the site, on a scale of 0.0 to 1.0.
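To see how these elements fit together, here is a minimal Python sketch that reads a sitemap and collects the values of each `<url>` entry. The helper name `read_sitemap` and the embedded sample are assumptions for illustration. Note that every tag lives in the sitemap namespace, so lookups need a namespace prefix.

```python
import xml.etree.ElementTree as ET

# Map a short prefix to the sitemap namespace for use in lookups.
NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

# Shortened sample in the same shape as the example sitemap.
SAMPLE = b"""<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>http://www.yourwebsite.com/</loc>
    <lastmod>2024-05-30</lastmod>
    <changefreq>daily</changefreq>
    <priority>1.0</priority>
  </url>
  <url>
    <loc>http://www.yourwebsite.com/contact.html</loc>
  </url>
</urlset>"""


def read_sitemap(xml_bytes):
    """Return one dict per <url> entry; optional elements become None."""
    root = ET.fromstring(xml_bytes)
    entries = []
    for url in root.findall("sm:url", NS):
        entries.append({
            "loc": url.findtext("sm:loc", namespaces=NS),
            "lastmod": url.findtext("sm:lastmod", namespaces=NS),
            "changefreq": url.findtext("sm:changefreq", namespaces=NS),
            "priority": url.findtext("sm:priority", namespaces=NS),
        })
    return entries


entries = read_sitemap(SAMPLE)
for entry in entries:
    print(entry["loc"], entry["lastmod"])
```

Only `<loc>` is present in the second entry, which shows that the other three elements really are optional.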
Benefits of a sitemap.xml
- Better indexing: Search engines can more easily find and index all the important pages on your site, improving visibility in search results.
- New page detection: New or recently updated pages are detected more quickly.
- Prioritization: You can hint to search engines which pages are most important to you, although crawlers are free to ignore the hint.
- Complex sites: Especially useful for large sites with many pages or for sites with complex structures that are difficult for normal crawlers to navigate.
Submission to search engines
- Google Search Console: Submit the sitemap URL in Google Search Console to make Google aware of your website structure directly.
- Bing Webmaster Tools: You can also submit your sitemap to Bing to help with indexing.
- robots.txt: Add a reference to the sitemap in the robots.txt file to make crawlers aware of it:
Sitemap: http://www.yourwebsite.com/sitemap.xml
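To check that the reference in robots.txt is picked up, a short Python sketch can parse the file with the standard library's `urllib.robotparser` (the `site_maps()` method requires Python 3.8 or later). The robots.txt content below, including the `/admin/` rule, is a made-up example, and `sitemaps_from_robots` is a hypothetical helper name.

```python
from urllib.robotparser import RobotFileParser

# Made-up robots.txt content for illustration; only the Sitemap line
# corresponds to the example above.
ROBOTS_TXT = """\
User-agent: *
Disallow: /admin/
Sitemap: http://www.yourwebsite.com/sitemap.xml
"""


def sitemaps_from_robots(robots_text):
    """Return the sitemap URLs declared in a robots.txt string."""
    parser = RobotFileParser()
    parser.parse(robots_text.splitlines())
    # site_maps() returns None when no Sitemap line is present.
    return parser.site_maps() or []


found = sitemaps_from_robots(ROBOTS_TXT)
print(found)
```

An empty list here would suggest that the `Sitemap:` line is missing or misspelled.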
Conclusion
The sitemap.xml is a useful tool that helps search engines efficiently find and index all relevant pages on your website. It is particularly useful for large and complex websites, as well as for pages whose content is generated dynamically, for example by JavaScript. By providing additional information such as the last modification date, change frequency and priority, the sitemap can support more effective and faster indexing, which can lead to better visibility in search results.