Are you struggling to find your web pages through online searches? In this post, you will learn how to check if your pages are indexed.
If you’re not ranking for anything you try, no matter how specific, it may be an indexing problem.
Every search engine draws its results from an index, and if a web page is missing from that index, it obviously can’t appear in the results. Because of this, an indexing problem can end up totally wasting all your high-quality content and on-page optimization work.
If you’re not sure whether your web pages are properly indexed, or you need some advice for resolving an indexing problem, this is the article for you. We’re going to look at how indexing works, what the most common indexing problems are, and how you can make the necessary changes to ensure that you don’t suffer from indexing problems again.
How indexing works
When new websites or pages are hosted online, they don’t announce themselves to search engines — the engines have to put in the work to find them. The way they do this is through search engine bots (otherwise referred to as crawlers). Crawlers, as the name suggests, make their way through websites, following internal and external links with the goal of indexing and organizing all the content they find.
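To make the link-following idea concrete, here is a minimal sketch of a breadth-first crawl. It runs over a small in-memory link map instead of the live web, and every URL, the `SITE` dictionary, and the `fetch_links` helper are made up for illustration. Real crawlers are far more sophisticated, but the principle of discovering pages through links is the same.

```python
from collections import deque

# Hypothetical in-memory "web": each URL maps to the links found on that page.
SITE = {
    "https://example.com/": ["https://example.com/blog", "https://example.com/shop"],
    "https://example.com/blog": ["https://example.com/blog/post-1", "https://example.com/"],
    "https://example.com/shop": ["https://example.com/shop/item-1"],
    "https://example.com/blog/post-1": [],
    "https://example.com/shop/item-1": [],
    "https://example.com/orphan": [],  # no page links here
}


def fetch_links(url):
    """Stand-in for downloading a page and extracting its links (assumption)."""
    return SITE.get(url, [])


def crawl(start, get_links):
    """Breadth-first crawl: return {url: click depth} for every page reachable from start."""
    depth = {start: 0}
    queue = deque([start])
    while queue:
        url = queue.popleft()
        for link in get_links(url):
            if link not in depth:  # skip pages that were already discovered
                depth[link] = depth[url] + 1
                queue.append(link)
    return depth


discovered = crawl("https://example.com/", fetch_links)
for url, clicks in discovered.items():
    print(clicks, url)
```

In this sketch the `/orphan` page never shows up in `discovered`, because no other page links to it. That is exactly how a live page can stay invisible to a crawler.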
All the data collected along the way is stored in the index, ready for use by the search engine. When a user submits a search query, the search engine does the following:
- Parse the query to best judge searcher intent.
- Filter the index in accordance with the inferred intent.
- Select all the pages deemed suitable (taking a large number of factors into account).
- Present them to the user in order of relevance.
Because pages are regularly updated, rising or falling in quality and relevance to particular topics, crawlers must return to indexed sites on a regular basis. How regularly a site is crawled will depend on how frequently it changes, how much authority it is considered to have, and numerous other metrics.
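The four query-handling steps above can be sketched with a toy index. This is only an illustration under heavy simplifying assumptions: the pages in `PAGES` are invented, "intent" is reduced to the words in the query, and relevance is a plain word count, whereas real search engines weigh a large number of factors.

```python
import re
from collections import defaultdict

# Hypothetical crawled pages: URL -> text content.
PAGES = {
    "/red-shoes": "red running shoes for trail running",
    "/blue-shoes": "blue shoes for casual wear",
    "/coffee": "how to brew coffee at home",
}


def tokenize(text):
    """Lowercase the text and split it into words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(pages):
    """Map each term to {url: number of times the term appears on that page}."""
    index = defaultdict(dict)
    for url, text in pages.items():
        for term in tokenize(text):
            index[term][url] = index[term].get(url, 0) + 1
    return index


def search(index, query):
    terms = tokenize(query)                  # 1. parse the query
    scores = defaultdict(int)
    for term in terms:                       # 2. filter the index by the query terms
        for url, count in index.get(term, {}).items():
            scores[url] += count             # 3. score the candidate pages
    # 4. present results in order of relevance (highest score first)
    return sorted(scores, key=lambda url: (-scores[url], url))


INDEX = build_index(PAGES)
results = search(INDEX, "running shoes")
print(results)
```

A page that was never added to `INDEX` can never appear in `results`, no matter how well it matches the query.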
Why some pages shouldn’t be indexed
Search indexes aren’t just databases of everything found by crawlers.
There are three reasons why it would be a bad idea to list absolutely every live link in a search index:
- The purpose of a search index is to store pages relevant to the searcher’s intent, and certain pages (and page types) do not contain any such content and are not worth including. For instance, websites with product filters can often end up automatically generating lengthy lists of different URLs for filtered or sorted views, and many of those views will not be of interest to anyone. Additionally, if multiple pages have the same content, only one of them should be returned: searchers don’t benefit from several results with the same content.
- Just as a page must be suitable from a content standpoint, it must also be up to standard from a technical standpoint. If a user clicks a search result link and the page never loads, opens up spam pop-ups, or provides an unacceptable user experience, then it reflects poorly on the search engine and discourages the user.
- Search engines want website owners to self-curate since there’s no benefit in including pages that the owners don’t want to be indexed. Sometimes website owners want to keep content around for posterity but choose to archive it because it is outdated and/or not fit for purpose. Search results should be kept to the most worthwhile pages, and those are the pages that website owners actively want to be seen.
As a result, there are various things that can lead to a page not being indexed by a search provider. Not only can the crawler decide that a page isn’t worth including, but the website owner can signify that a page should not be indexed, and even tag specific links to tell crawlers not to follow them in the first place.
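As a rough sketch of what those owner signals look like in practice, the snippet below scans a page's HTML for a robots meta tag and for links marked `rel="nofollow"`, using Python's standard `html.parser` module. The sample HTML and the `RobotsDirectiveParser` class name are invented for illustration. The sketch only covers signals in the HTML itself, not any that might be sent another way.

```python
from html.parser import HTMLParser


class RobotsDirectiveParser(HTMLParser):
    """Collect page-level robots directives and links marked rel="nofollow"."""

    def __init__(self):
        super().__init__()
        self.directives = set()
        self.nofollow_links = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "meta" and (attrs.get("name") or "").lower() == "robots":
            # e.g. content="noindex, follow" -> {"noindex", "follow"}
            content = attrs.get("content") or ""
            self.directives.update(
                part.strip().lower() for part in content.split(",") if part.strip()
            )
        elif tag == "a":
            rel_values = (attrs.get("rel") or "").lower().split()
            if "nofollow" in rel_values and attrs.get("href"):
                self.nofollow_links.append(attrs["href"])


# Hypothetical page source.
SAMPLE_HTML = """
<html><head>
  <meta name="robots" content="noindex, follow">
</head><body>
  <a href="/pricing">Pricing</a>
  <a href="/login" rel="nofollow">Log in</a>
</body></html>
"""

parser = RobotsDirectiveParser()
parser.feed(SAMPLE_HTML)

# "none" is shorthand for "noindex, nofollow", so treat it as noindex too.
is_noindex = bool({"noindex", "none"} & parser.directives)
print("noindex:", is_noindex)
print("nofollow links:", parser.nofollow_links)
```

Running something like this across your key pages is a quick way to spot a stray noindex tag left over from a staging environment.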
How to see if a page has been indexed
To see if your pages have been indexed by Google, use the Google Index Checker tool. That’s a great tool to see exactly which of your pages are not indexed and may require tweaking or improvement.
Common reasons why a page isn’t indexed
As noted, there are numerous reasons why a page doesn’t end up being indexed. Here are the most common:
- Crawlers can’t find it. If your website doesn’t have a comprehensive XML sitemap (a list of all pages to be indexed), has no internal links to a particular page, or buries that page deep inside folders, a crawler often won’t be able to find it and thus can’t index it.
- The page is set to ‘noindex’. Even if you have an internal link to a page, or an external link pointing to it from another domain, the page might have been tagged as ‘noindex’, which tells search engines to leave it out of the index.
- It’s blocked in robots.txt. Many websites serve a file called robots.txt, which contains instructions for crawlers. If a robots.txt file forbids all crawlers from crawling the site, that’s obviously an enormous problem. Though that is uncommon, it’s not all that rare for a developer to try to block specific pages but accidentally end up blocking far more in the process.
- The quality level isn’t high enough. If your domain features low-value links and not enough content (or no content at all), search engines may decide not to index some or all of your pages in an effort to maintain a high standard.
There are various other possible causes for pages not being indexed, but they can be quite technically complex and depend on the exact nature of your site. In the vast majority of cases, the explanation will be one of those listed above.
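The accidental over-blocking case is easy to test for. Here is a small sketch using Python's standard `urllib.robotparser` module. The robots.txt content and the URLs are hypothetical: the rule was meant to block only the private section, but because rules match by prefix, it catches the product pages as well. Note that Python's parser follows the basic robots.txt convention, so its verdicts may not match every search engine's rule handling exactly.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: the author meant to block only /private/,
# but "Disallow: /p" is a prefix match and also covers /products/.
ROBOTS_TXT = """\
User-agent: *
Disallow: /p
"""

# Hypothetical URLs that you expect to be crawlable (or not).
URLS = [
    "https://example.com/",
    "https://example.com/private/report",
    "https://example.com/products/widget",
    "https://example.com/blog/post-1",
]


def blocked_urls(robots_txt, urls, user_agent="*"):
    """Return the URLs that the given robots.txt rules forbid for user_agent."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [url for url in urls if not parser.can_fetch(user_agent, url)]


blocked = blocked_urls(ROBOTS_TXT, URLS)
for url in blocked:
    print("blocked:", url)
```

If a URL you care about shows up in `blocked`, the rule is broader than intended and should be tightened (for example to `Disallow: /private/`).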
The consequences of indexing issues
How much a website is affected by indexing issues depends on the nature and breadth of the issues as well as which pages they affect. If a minor page on your site goes unindexed, it isn’t the end of the world, but if a high-quality SEO-friendly piece of evergreen content isn’t indexed, that’s a sizeable waste of effort.
And for websites in the e-commerce sector, indexing is even more significant. Organic traffic is by far the most cost-effective form of traffic for product pages because it doesn’t cost anything to appear in search results, unlike PPC or social media advertising. If half a company’s products aren’t indexed, the conversion opportunities are hugely diminished.
Knowing when a page will be indexed
It’s very common for website owners to pose one particular question: when will my page be indexed? Unfortunately, there’s no way of knowing conclusively. Even if you do absolutely everything correctly, strictly following the guidelines provided by Google and other search providers, it will depend on factors outside of your control.
Because search indexes encompass billions of pages from all over the world, and must consistently refresh their crawls to ensure up-to-date information, your page could be indexed tomorrow, next week, or in a couple of months.
Sustainability is only possible through operational efficiency, and there’s no efficiency or value in making an effort to index absolutely everything as a matter of urgency. This is why Google talks about a crawl budget: a limit on how much crawling it will devote to a given site, which in turn affects how frequently each page is crawled.
How to review your web setup for indexing
To ensure that indexed pages remain properly indexed, new pages get added to that list, and unwanted pages are not included, it’s important for any company with a large web presence to commit time and resources to this specific SEO issue.
On a somewhat regular basis (perhaps every three months or so), you should carry out a thorough review of the following things:
- The information architecture of your website. Is everything functioning as it should from a technical standpoint? Is the server handling the load as it should?
- Your internal linking structure. Do you have sufficient internal links to support regular crawling? You can be fairly liberal with internal links as they provide useful context. Use this free link opportunities tool to check if you are missing any relevant internal links.
- Your breadcrumb setup. Are pages nested correctly and set in the right categories? Maintaining a logical internal structure is very important for showing search engines that your pages are worth indexing.
While this won’t require you to become an IT expert, you will need to either get to grips with all of these concepts or consult with someone who knows exactly how to check these things and make any required changes.
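As one example of what such a review might involve, the sketch below compares the URLs in an XML sitemap against an internal link map. All the data is hypothetical: `SITEMAP_XML` stands in for your real sitemap file and `INTERNAL_LINKS` for a link export from whatever crawling tool you use. The goal is simply to surface pages that are listed in the sitemap but never linked internally, and pages that are linked but missing from the sitemap.

```python
import xml.etree.ElementTree as ET

# Namespace used by the standard sitemap format.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

# Hypothetical sitemap contents.
SITEMAP_XML = """\
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/</loc></url>
  <url><loc>https://example.com/blog</loc></url>
  <url><loc>https://example.com/landing/spring-sale</loc></url>
</urlset>
"""

# Hypothetical internal link map: page URL -> URLs it links to.
INTERNAL_LINKS = {
    "https://example.com/": ["https://example.com/blog", "https://example.com/about"],
    "https://example.com/blog": ["https://example.com/"],
}


def sitemap_urls(xml_text):
    """Extract every <loc> URL from a sitemap document."""
    root = ET.fromstring(xml_text)
    return {loc.text.strip() for loc in root.findall("sm:url/sm:loc", SITEMAP_NS)}


def audit(sitemap, links):
    """Compare sitemap entries with the pages that internal links point to."""
    linked = {target for targets in links.values() for target in targets}
    known = linked | set(links)
    return {
        "in_sitemap_not_linked": sorted(sitemap - linked),
        "linked_not_in_sitemap": sorted(known - sitemap),
    }


report = audit(sitemap_urls(SITEMAP_XML), INTERNAL_LINKS)
for issue, urls in report.items():
    print(issue, urls)
```

Here the spring sale landing page is in the sitemap but has no internal links, and the about page is linked but missing from the sitemap. Both are worth a closer look during a review.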
Where indexing fits in your SEO strategy
When companies think about how SEO plays into their marketing strategies, they tend to view technical SEO considerations as low-priority concerns. Other SEO considerations such as content marketing or social media outreach are more creative and thus viewed as more glamorous and interesting.
The problem with this line of thinking, of course, is that overlooking the technical fundamentals is extremely foolish. If you budget for a lengthy campaign of paid advertising, social media work, content production, and brand advocacy, but fail to realize that you’re building authority around a page that cannot be indexed in Google, it will amount to a futile investment: as soon as the campaign ends, your traffic will all but disappear once again.
Using indexes for competitor research
Leaving aside your own indexing, there’s another aspect of search indexes that warrants a mention: competitor research. By looking at which pages your competitors have indexed (and which ones they don’t), you can get an idea of what they’re doing and have the opportunity to reverse engineer their strategies.
Just think about how much information is readily available to you at no cost through simple Google searches. If you invest a little time in reviewing how other companies in your industry handle indexing, backlinks, and search results in general, you’ll get a lot out of it.
Wrapping up how to check if your page is indexed
Building a website into a highly competitive online presence is challenging at the best of times, no matter how much great content you have or how well you’re engaging with your audience. It takes time and consistency, and there are plenty of other excellent sites out there worth being ranked above you if you let your standards slip.
Because you must invest so much time and effort into content, outreach, and UX development, it’s absolutely paramount that you ensure you aren’t held back by a fundamental technical issue like having important pages not indexed.
We’ve covered what indexing is, why it matters so much, and how you can take action to identify and overcome common indexing problems. The rest is up to you. Find the time to review your setup thoroughly, and schedule semi-frequent reviews to make sure that your efforts aren’t undermined by avoidable technical problems.
Kayleigh Toyra: Content Strategist
Half-Finnish, half-British marketer based in Bristol. I love to write and explore themes like storytelling and customer experience marketing. I manage a small team of writers at a boutique agency.
A good list and points on how to check if the page has been indexed or not
I am glad you found the article useful. I would love it if you could share your experience of dealing with indexation issues.
extremely useful article!
Thank you for sharing great information! Well, sometimes I face some issues with Google: they don’t index my pages and I don’t know why. Is there any way to check why Google leaves some pages out of the index?
The main issues why Google might not index your page are outlined in this post but if you would like – you can share your website and I can have a look and let you know which could be the things you could improve. Let me know if you are interested 😉
Useful article. But Google also indexes archive pages and tag pages. Should these pages be indexed or not?
That’s up to you. Google indexes whatever you let it index. If you block a page in your robots.txt file, Google won’t crawl it, and if you add a noindex tag to the header of the page, Google will keep it out of the index.