If you’re not ranking for anything you try, no matter how specific, it might not have anything to do with the quality of your content — it might be an indexing problem.
Every search engine draws its results from an index, and if a web page is missing from that index, it obviously can’t appear in the results. Because of this, an indexing problem can end up totally wasting all your high-quality content and on-page optimization work.
If you’re not sure whether your web page is being indexed properly, or you need some advice for resolving an indexing problem, this is the article for you. We’re going to look at how indexing works, what the most common indexing problems are, and how you can make the necessary changes to ensure that you don’t suffer from indexing problems again.
How indexing works
When new websites or pages are hosted online, they don’t announce themselves to search engines — the engines have to put in the work to find them. The way they do this is through search engine bots (otherwise referred to as crawlers). Crawlers, as the name suggests, make their way through websites, following internal and external links with the goal of indexing and organizing all the content they find.
All the data collected along the way is passed back to the index, ready for use by the search engine. When a user submits a search query, the search engine does the following:
- Parse the query to best judge searcher intent.
- Filter the index in accordance with the inferred intent.
- Select all the pages deemed suitable (taking a large number of factors into account).
- Present them to the user in order of relevance.
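To make those four steps concrete, here is a minimal sketch of a toy index in Python. It is an illustration under loud assumptions, not how any real search engine works: the `PAGES` corpus, the example.com URLs, and the word-count scoring are all hypothetical stand-ins for far more sophisticated systems.

```python
from collections import defaultdict

# Hypothetical mini-corpus standing in for crawled pages (URL -> page text).
PAGES = {
    "https://example.com/red-shoes": "red running shoes for trail running",
    "https://example.com/blue-shoes": "blue walking shoes",
    "https://example.com/hats": "wool hats for winter",
}


def build_index(pages):
    """Map each word to the set of URLs whose text contains it."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word in text.lower().split():
            index[word].add(url)
    return index


def search(query, pages, index):
    """Return URLs matching any query word, ordered by a crude score."""
    terms = query.lower().split()                # 1. parse the query
    candidates = set()
    for term in terms:                           # 2. filter the index
        candidates |= index.get(term, set())
    scored = []
    for url in candidates:                       # 3. select and score pages
        words = pages[url].lower().split()
        score = sum(words.count(term) for term in terms)
        scored.append((score, url))
    # 4. present in order of relevance (highest score first, URL as tie-break)
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [url for _, url in scored]


INDEX = build_index(PAGES)
print(search("running shoes", PAGES, INDEX))
print(search("sandals", PAGES, INDEX))  # nothing indexed for this word
```

The second query returns an empty list, which is the point of this article in miniature: if a page never made it into the index, no query can surface it.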
Because pages are regularly updated, rising or falling in quality and relevance to particular topics, crawlers must return to indexed sites on a regular basis. How regularly a site is crawled will depend on how frequently it changes, how much authority it is considered to have, and numerous other metrics.
Why some pages shouldn’t be indexed
Search indexes aren’t just databases of everything found by crawlers.
There are three reasons why it would be a bad idea to list absolutely every live link in a search index:
- The purpose of a search index is to store links relevant to searcher intent, and certain pages (and page types) do not contain any such content and are not worth including. For instance, websites with product filters can often end up automatically generating lengthy lists of different URLs for filtered or sorted views, and many of those views will be of no interest to anyone. Additionally, if multiple pages have the same content, only one of them should be given a link: searchers don’t benefit from several links to the same content.
- Just as a page must be suitable from a content standpoint, it must also be up to scratch on a technical level. If a user clicks a search result link and the page never loads, or opens up spam pop-ups, or provides an unacceptable user experience, then it reflects poorly on the search engine and discourages the user.
- Search engines want website owners to self-curate since there’s no benefit in including pages that the owners don’t want indexed. Sometimes website owners want to keep content around for posterity but choose to archive it because it is outdated and/or not fit for purpose. Search results should be kept to the most worthwhile pages, and those are the pages that website owners actively want to be seen.
As a result, there are various things that can lead to a page not being indexed by a search provider. Not only can the crawler decide that a page isn’t worth including, but the website owner can signify that a page should not be indexed, and even tag specific page links as unsuitable for indexing to tell crawlers not to bother following them in the first place.
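If you want to see how a page signals these preferences, the sketch below uses Python’s standard `html.parser` module to look for a robots meta tag containing `noindex` and for links marked `rel="nofollow"`. The `RobotsDirectiveParser` class and the sample HTML are hypothetical, and the check is deliberately simplified: it ignores HTTP headers and crawler-specific meta tags, which can carry the same instructions.

```python
from html.parser import HTMLParser


class RobotsDirectiveParser(HTMLParser):
    """Collect noindex and nofollow signals from a page's HTML."""

    def __init__(self):
        super().__init__()
        self.noindex = False        # True if a robots meta tag says noindex
        self.nofollow_links = []    # hrefs of links marked rel="nofollow"

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "meta" and (attrs.get("name") or "").lower() == "robots":
            content = attrs.get("content") or ""
            directives = [d.strip().lower() for d in content.split(",")]
            if "noindex" in directives:
                self.noindex = True
        elif tag == "a":
            rel_values = (attrs.get("rel") or "").lower().split()
            if "nofollow" in rel_values and attrs.get("href"):
                self.nofollow_links.append(attrs["href"])


# Hypothetical page source used purely for illustration.
SAMPLE_HTML = """
<html><head><meta name="robots" content="noindex, follow"></head>
<body>
<a href="/pricing">Pricing</a>
<a href="/login" rel="nofollow">Log in</a>
</body></html>
"""

parser = RobotsDirectiveParser()
parser.feed(SAMPLE_HTML)
print(parser.noindex)         # whether the page asks not to be indexed
print(parser.nofollow_links)  # links crawlers are asked not to follow
```

Running something like this against the saved source of an important page is a quick way to spot a stray `noindex` left over from development.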
How to see if something has been indexed
Thankfully, it’s fairly simple to see if a web page has been indexed, even though you won’t get any further information at this stage. Simply copy the exact page URL and paste it into the search bar as your query. If you get no results, then it has clearly not been indexed.
You can also check your copy on a more granular level by selecting a piece of unique content from the page and searching for it (bracket it with quote marks to ensure that you don’t see similar but not identical content). In the event that a page has been indexed but a segment of its copy isn’t appearing in results, you’ll know that there is a major on-page issue to be investigated.
If you want to check an entire site, you can do so by searching using the following format: site:yourdomainurl. This string will check that domain and return every indexed page, which can be useful if you’re dealing with a large site containing a lot of pages.
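If you run these checks often, it can help to build the query strings programmatically. The following is a small sketch with hypothetical helper names (`site_query`, `exact_phrase_query`, `google_search_url`); it only constructs the query text and a search URL for you to open in a browser, and does not fetch or scrape any results.

```python
from urllib.parse import quote_plus


def site_query(domain):
    """Build a query that restricts results to one domain."""
    return f"site:{domain}"


def exact_phrase_query(phrase, domain=None):
    """Wrap a phrase in quote marks, optionally restricted to a domain."""
    query = f'"{phrase}"'
    if domain:
        query = f"{site_query(domain)} {query}"
    return query


def google_search_url(query):
    """URL-encode a query so it can be pasted into a browser."""
    return "https://www.google.com/search?q=" + quote_plus(query)


# example.com and the phrase are placeholders for your own domain and copy.
print(site_query("example.com"))
print(exact_phrase_query("a unique sentence", "example.com"))
print(google_search_url(site_query("example.com")))
```

Combining the two operators, as the second call does, lets you check whether a specific piece of copy has been indexed on your own domain rather than anywhere on the web.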
Common reasons why a page isn’t indexed
As noted, there are numerous reasons why a page doesn’t end up being indexed. Here are the most common:
- Crawlers can’t find it. If your website doesn’t have a comprehensive XML sitemap (a list of all pages to be indexed), if no internal links point to a particular page, or if the page is buried deep inside folders, a crawler often won’t be able to find it, and thus will be unable to index it.
- The page is set to ‘noindex’. Even if you have an internal link to a page, or an external link pointing to it from another domain, the page might have been tagged as ‘noindex’, leading to crawlers ignoring it.
- It’s blocked in robots.txt. Many websites provide a file called robots.txt, which contains instructions for crawlers. If a robots.txt file forbids all crawlers from accessing the site, that’s obviously an enormous problem. Though that is uncommon, it’s not all that rare for a developer to try to block specific pages but accidentally end up blocking far more in the process.
- The quality level isn’t high enough. If your domain features low-value links and not enough content (or no content at all), search engines may decide not to index some or all of your pages in an effort to maintain a high standard.
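The robots.txt case in particular is easy to test for yourself. Here is a minimal sketch using Python’s standard `urllib.robotparser` module. The rules in `ROBOTS_TXT` and the example.com URLs are hypothetical, and real crawlers may interpret rules slightly differently from this parser, so treat the result as a first check rather than a verdict.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules: the developer only meant to block the /private/ folder.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private
"""


def blocked_urls(robots_txt, urls, user_agent="*"):
    """Return the URLs that the given rules forbid the user agent to fetch."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [url for url in urls if not parser.can_fetch(user_agent, url)]


URLS = [
    "https://example.com/private/report",
    "https://example.com/private-sale",   # blocked by accident
    "https://example.com/products",
]

print(blocked_urls(ROBOTS_TXT, URLS))
```

Because rules are matched as path prefixes, `Disallow: /private` also catches `/private-sale`. Writing the rule as `Disallow: /private/` would limit it to the folder, which is exactly the kind of small slip that ends up blocking far more than intended.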
There are various other possible causes for pages not being indexed, but they can be quite technically complex and depend on the exact nature of your site. In the vast majority of cases, the explanation will be one of those listed above.
The consequences of indexing issues
How much a website is affected by indexing issues depends on the nature and breadth of the issues as well as which pages they affect. If a minor page on your site goes unindexed, it isn’t the end of the world, but if a high-quality SEO-friendly piece of evergreen content isn’t indexed, that’s a sizeable waste of effort.
And for websites in the e-commerce sector, indexing is even more significant. Organic traffic is by far the most cost-effective form of traffic for product pages because it doesn’t cost anything to appear in search results, unlike PPC or social media advertising. If half a company’s products aren’t indexed, the conversion opportunities are hugely diminished.
Knowing when a page will be indexed
It’s very common for website owners to pose one particular question: when will my page be indexed? Unfortunately, there’s no way of knowing conclusively. Even if you do absolutely everything correctly, strictly following the guidelines provided by Google and other search providers, it will depend on factors outside of your control.
Because search indexes encompass millions upon millions of pages from all over the world, and must consistently refresh their crawls to ensure up-to-date information, your page could be indexed tomorrow, next week, or in a couple of months.
Crawling at that scale is only sustainable if it is done efficiently, and there’s no efficiency or value in making an effort to index absolutely everything as a matter of urgency. This is why Google talks about having a crawl budget that determines how frequently a page is crawled.
How to review your web setup for indexing
To ensure that indexed pages remain properly indexed, new pages get added to that list, and unwanted pages are not included, it’s important for any company with a large web presence to commit time and resources to this specific SEO issue.
On a somewhat regular basis (perhaps every three months or so), you should carry out a thorough review of the following things:
- The information architecture of your website. Is everything functioning as it should from a technical standpoint? Is the server handling the load as it should?
- Your internal linking structure. Do you have sufficient internal links to support regular crawling? You can be fairly liberal with internal links as they provide useful context, but don’t go overboard with them as it will risk a penalty.
- Your breadcrumb setup. Are pages nested correctly and set in the right categories? Maintaining a logical internal structure is very important for showing search engines that your pages are worth indexing.
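Part of the internal linking review can be automated. The sketch below assumes you already have a map of which pages link to which (the `LINKS` dictionary here is hypothetical, as is the `max_depth` threshold of two clicks) and uses a breadth-first search from the homepage to flag pages that no internal link reaches, along with pages buried many clicks deep.

```python
from collections import deque

# Hypothetical internal link map: page -> pages it links to.
LINKS = {
    "/": ["/products", "/blog"],
    "/products": ["/products/shoes"],
    "/products/shoes": ["/products/shoes/red"],
    "/products/shoes/red": [],
    "/blog": ["/"],
    "/old-landing-page": ["/"],  # links out, but nothing links to it
}


def click_depths(links, start="/"):
    """Breadth-first search giving the fewest clicks from start to each page."""
    depths = {start: 0}
    queue = deque([start])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in depths:
                depths[target] = depths[page] + 1
                queue.append(target)
    return depths


def review_links(links, max_depth=2, start="/"):
    """Return (orphan pages, pages deeper than max_depth clicks)."""
    depths = click_depths(links, start)
    orphans = sorted(page for page in links if page not in depths)
    deep = sorted(page for page, depth in depths.items() if depth > max_depth)
    return orphans, deep


orphans, deep = review_links(LINKS)
print("Unreachable from the homepage:", orphans)
print("More than two clicks deep:", deep)
```

Pages in the first list rely entirely on a sitemap or external links to be discovered, and pages in the second are candidates for a more prominent internal link.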
While this won’t require you to become an IT expert, you will need to either get to grips with all of these concepts or consult with someone who knows exactly how to check these things and make any required changes.
Where indexing fits in your SEO strategy
When companies think about how SEO plays into their marketing strategies, they tend to view technical SEO considerations as low-priority concerns. Other SEO considerations such as content marketing or social media outreach are more creative and thus viewed as more glamorous and interesting.
The problem with this line of thinking, of course, is that overlooking the technical fundamentals is extremely foolish. If you budget for a lengthy campaign of paid advertising, social media work, content production and brand advocacy, but fail to realize that you’re building authority around a page that cannot be indexed in Google, it will amount to a futile investment — as soon as the campaign ends, your traffic will all but disappear once again.
Using indexes for competitor research
Leaving aside your own indexing, there’s another aspect of search indexes that warrants a mention: competitor research. By looking at which pages your competitors index (and which ones they don’t), you can get an idea of what they’re doing and have the opportunity to reverse engineer their strategies.
Just think about how much information is readily available to you at no cost through simple Google searches. If you invest a little time in reviewing how other companies in your industry handle indexing, backlinks and search results in general, you’ll get a lot out of it.
Wrapping up how to check if your page is indexed
Building a website into a highly competitive online presence is challenging at the best of times, no matter how much great content you have or how well you’re engaging with your audience. It takes time and consistency, and there are plenty of other excellent sites out there worthy of being ranked above you if you let your standards slip.
Because you must invest so much time and effort into content, outreach, and UX development, it’s absolutely paramount that you ensure you aren’t held back by a fundamental technical issue like having important pages not indexed.
We’ve covered what indexing is, why it matters so much, and how you can take action to identify and overcome common indexing problems. The rest is up to you. Find the time to review your setup thoroughly, and schedule in semi-frequent reviews to make sure that your efforts aren’t undermined by avoidable technical problems.
Kayleigh Toyra: Content Strategist
Half-Finnish, half-British marketer based in Bristol. I love to write and explore themes like storytelling and customer experience marketing. I manage a small team of writers at a boutique agency.