
What Is a Sitemap? Best Practices for Creating One

If you want your website to be effective and rank well in Google, a sitemap helps. As its name suggests, a sitemap is a map of a website.

It is structured around the content the site publishes: its pages, videos, and downloadable files, and the relationships between them.

It’s an important factor if you want to improve your page’s SEO.

If you need web design with the best SEO practices, we at Dazzet can help you.

A sitemap makes it easier for search engines like Google to read your site: to decide how to position content, identify publication dates, see whether pages are updated periodically, and detect alternate versions (for example, when the same content is published in multiple languages).

For images, a sitemap can include information about the subject, the type of image, and its publication and distribution license.

For videos, a sitemap can contain data about duration, category, and age rating.

When Is It Important to Have a Sitemap

As mentioned above, a sitemap makes Google’s process of crawling and indexing pages much more effective.

However, Google itself tells us that having one does not guarantee that all elements of the map will be crawled and indexed, due to the complexity of the algorithms used by this search engine.

In most cases, though, sites with sitemaps benefit in some way, and Google does not penalize sites for having them.

A sitemap lets Google crawl the content hosted on a website much more easily. If your platform does not generate one, you can create a sitemap with an external tool and add it manually.

Recommendations for integrating sitemaps depend on the characteristics and architecture of the website, as well as on how often new content is uploaded to it.

It is advisable to have a sitemap and keep it updated when the website has any of the following characteristics:

  • It is very large: it contains many pages or is updated constantly.
  • Its pages are not well linked to each other, which makes them hard for search engines to crawl automatically.
  • It has a large amount of “rich media” (videos, images).
  • It is new or has few external links pointing to it.

Types of Sitemaps

Sitemap formats include XML, RSS, MRSS, Atom, and plain text (a manually written list of website links).

The most important are in XML and HTML formats.

XML

This format gives search engines an efficient list of all the URLs that make up the website, which is especially useful when it has many pages (such as product pages, catalogs, blogs, different services, etc.).

XML sitemaps are just text files marked with tags identifying data types.
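To make the "text file marked with tags" idea concrete, here is a minimal sketch that builds a small XML sitemap with Python's standard library. The function name `build_sitemap`, the example URLs, and the date are assumptions made for illustration; the `urlset`, `url`, `loc`, and `lastmod` tags and the namespace come from the sitemaps.org protocol.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(pages):
    """Return sitemap XML as a string for an iterable of (url, lastmod) pairs.

    `lastmod` may be None, because the tag is optional in the protocol.
    """
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url, lastmod in pages:
        entry = ET.SubElement(urlset, "url")
        ET.SubElement(entry, "loc").text = url
        if lastmod:
            ET.SubElement(entry, "lastmod").text = lastmod
    ET.indent(urlset)  # pretty-print; requires Python 3.9 or newer
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


# Hypothetical pages; replace with the real URLs of your site.
print(build_sitemap([
    ("https://www.yourbusiness.com/", "2024-01-15"),
    ("https://www.yourbusiness.com/blog/", None),
]))
```

Saving the returned string as `sitemap.xml` produces a file that search engines can read. Plugins and generators do essentially this for you, with more metadata.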

The URL of an XML sitemap is usually at the root of a domain, for example, www.yourbusiness.com/sitemap.xml, ready for crawler bots to access more easily.

XML sitemaps also come in specialized types:

  • Image Sitemaps
  • Video Sitemaps
  • News Sitemaps
  • Mobile Sitemaps

When a bot visits a site, it first accesses the robots.txt file, which is a list of instructions, including URLs to crawl or ignore. The robots.txt file should reference the XML sitemap, which in turn sends the bot to crawl the list of URLs.
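As a rough illustration of how a crawler discovers the sitemap, the sketch below parses a robots.txt file with Python's standard `urllib.robotparser` module and reads the `Sitemap:` line from it. The robots.txt content, the domain, and the helper name `sitemaps_from_robots` are hypothetical; real crawlers such as Googlebot use their own parsers.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content; the domain and paths are placeholders.
ROBOTS_TXT = """\
User-agent: *
Disallow: /admin/
Sitemap: https://www.yourbusiness.com/sitemap.xml
"""


def sitemaps_from_robots(robots_txt):
    """Return the sitemap URLs declared in a robots.txt string."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    # site_maps() (Python 3.8+) returns None when no Sitemap line exists.
    return parser.site_maps() or []


print(sitemaps_from_robots(ROBOTS_TXT))
```

The `Sitemap:` line takes a full URL, and a robots.txt file may contain more than one of them.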

XML sitemaps follow precise markup rules. Ideally, once it is set up, the XML sitemap is generated and updated automatically, without your intervention.

Even so, it is advisable to check it regularly for errors, because outdated, inaccurate, and duplicate URLs frequently creep in.
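One of these checks is easy to automate. The following sketch, which assumes the sitemap is available as an XML string, lists URLs that appear more than once. The function name `find_duplicate_urls` and the sample data are illustrative only, and the check does not detect outdated or inaccurate URLs.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Namespace prefix mapping for the sitemaps.org protocol.
NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

# Hypothetical sitemap with one URL listed twice.
SAMPLE_SITEMAP = """\
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://www.yourbusiness.com/</loc></url>
  <url><loc>https://www.yourbusiness.com/blog/</loc></url>
  <url><loc>https://www.yourbusiness.com/blog/</loc></url>
</urlset>"""


def find_duplicate_urls(sitemap_xml):
    """Return a sorted list of URLs that occur more than once in the sitemap."""
    root = ET.fromstring(sitemap_xml)
    locs = [el.text.strip() for el in root.findall("sm:url/sm:loc", NS) if el.text]
    counts = Counter(locs)
    return sorted(url for url, n in counts.items() if n > 1)


print(find_duplicate_urls(SAMPLE_SITEMAP))
```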

XML sitemaps have limits, including:

  • No guarantee of indexing: according to Google and other search engines, an XML sitemap only suggests the URLs you want crawled and indexed. Search engines may not crawl all of them, and may not index them even when they are crawled.
  • No link authority passed: unlike HTML links, URLs in an XML sitemap do not transfer link authority, so a URL that is found only in the XML sitemap is unlikely to rank.

HTML

Unlike XML sitemaps, HTML sitemaps are meant to be seen by website visitors, usually as a page or menu of links embedded in the site.

This makes it easier for people to get from the home page, or any other page, to the sections that interest them. For example, if you sell many products in different categories, visitors can find each category in the sitemap and go to it directly.

That is why websites with hundreds of pages often include an HTML sitemap to improve the user experience.

HTML sitemaps usually have limited SEO value because not all of them are crawled by search engine bots.

The HTML format is becoming less common because of how newer websites are designed: many rely on scrolling navigation on the home page, which already gives visitors a thorough overview of the content available on the site.

Previously, HTML sitemaps facilitated direct access to multiple pages and increased SEO rankings.

Today, many HTML sitemaps simply replicate links that are already available in the header or footer. Some sites still use HTML sitemaps for main navigation.

For SEO, the HTML format can offer you some limited benefits when:

  • The main site navigation (Home) does not link to all pages.
  • Navigation or a section of the site is inaccessible to search engines.
  • There are many product or service pages and subpages that are not easy to find through the site’s normal navigation.
  • Analytics data show that visitors use the HTML sitemap instead of following the paths you planned for them. If so, we recommend using heatmaps to investigate why people are not reaching certain pages directly.

In general, an HTML sitemap carries no penalty and does no harm to SEO, because it is another useful form of internal linking. It does not need much priority, though: if you want to improve organic growth and conversions, you should first make sure the pages in the main site navigation are indexed.

Software Available for Creating Sitemaps

Many tools can generate a sitemap and integrate it into your website. Which one fits depends on how your site is built:

  • Yoast: a popular SEO plugin for WordPress that also generates sitemaps.
  • XML Sitemaps: a generator you can use for sites not built on WordPress.
  • Google Sitemaps: a tool specifically designed to create sitemaps that comply with Google’s guidelines.

Best Practices for Sitemaps

Create a Sitemap

Your first step is to create a sitemap.

If you use WordPress, you can generate one with the Yoast SEO plugin. The main benefit of using Yoast to create your XML sitemap is that it updates automatically (the sitemap is dynamic).

So, every time you add a new page to your site (whether it’s a blog post or an e-commerce product page), a link to that page is automatically added to your sitemap file.

If you don’t use Yoast, there are many other WordPress plugins (like Google XML Sitemaps or Rank Math) that you can use to create a sitemap.

What if you don’t use WordPress? Don’t worry. You can use a third-party sitemap generator like XML-Sitemaps.com. These tools generate an XML file that you can use as your sitemap.

Either way, once your sitemap has been created, it is advisable to review it manually.

Submitting the Sitemap to Google

To submit your sitemap, log in to your Google Search Console account.

Then, go to “Index” → “Sitemaps” in the sidebar.

If you have already submitted a sitemap, you will see it listed under “Submitted Sitemaps” on this page.

Either way, to submit your sitemap, enter its URL in the “Add a new sitemap” field and click “Submit”.

Use the Sitemap to Identify Indexing Issues

One of the useful things about a sitemap is that it gives you a rough estimate of:

  • How many pages you WANT to index.
  • How many pages ARE indexed.

For example, suppose your sitemap links to 5,000 pages. But when you look at Google Search Console, your site only has 2,000 pages indexed.

That’s a sign that something is off. It could be that there’s a lot of duplicate content in those 5,000 pages, so Google isn’t indexing all of them.

Or it could be that the number of pages on your site exceeds your crawl budget.
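The comparison in this example is simple arithmetic, sketched below with the same numbers. The function name `index_coverage` is made up for illustration. The indexed count has to be read from Google Search Console by hand, since this sketch does not call any API.

```python
def index_coverage(submitted, indexed):
    """Return the share of submitted sitemap URLs that are indexed (0.0 to 1.0)."""
    if submitted <= 0:
        raise ValueError("submitted must be a positive number of URLs")
    return indexed / submitted


# Numbers from the example above: 5,000 URLs in the sitemap, 2,000 indexed.
ratio = index_coverage(5000, 2000)
print(f"{ratio:.0%} of the sitemap URLs are indexed")  # prints "40% ..."
```

A ratio this low is the kind of gap worth investigating for duplicate content or crawl budget problems.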

Ensure Your Sitemaps and Robots.txt Match

It’s important that your sitemaps and robots.txt work together. In other words:

If you block a page in robots.txt or use the “noindex” tag on a page, you WON’T want it to appear in your sitemap. Otherwise, you’re sending conflicting messages to Google. Your sitemap says, “This page is important enough to be in our sitemap.” But when Googlebot gets to the page, it’s blocked.
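A quick way to spot this kind of conflict is to test every sitemap URL against your robots.txt rules. The sketch below does that with Python's standard `urllib.robotparser`. The function name, the rules, and the URLs are hypothetical. It covers only robots.txt blocking, not “noindex” tags, and Python's parser may not interpret every rule exactly as Googlebot does.

```python
from urllib.robotparser import RobotFileParser


def blocked_sitemap_urls(robots_txt, sitemap_urls, user_agent="Googlebot"):
    """Return the sitemap URLs that robots.txt disallows for the given user agent."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [url for url in sitemap_urls if not parser.can_fetch(user_agent, url)]


# Hypothetical rules and sitemap entries, for illustration only.
robots_txt = "User-agent: *\nDisallow: /private/\n"
sitemap_urls = [
    "https://www.yourbusiness.com/",
    "https://www.yourbusiness.com/private/report",
]
print(blocked_sitemap_urls(robots_txt, sitemap_urls))
```

Any URL this returns should either be removed from the sitemap or unblocked in robots.txt, depending on whether you want it indexed.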

Juan Esteban Yepes
