Week 4: Crawled, Not Indexed – How I Triaged 1,219 Rejected Pages and What I Did With Each One

The centrepiece of this week was a full audit of my crawled-not-indexed report: 1,219 URLs Google had visited and decided weren’t worth showing in search results. What I found, and what I did about it, is the bulk of what this post covers.

This is Week 4 of my rebuild-in-public series. In Week 1, I de-indexed 1,300 pages from my website, a decision that made several people question my sanity, including, briefly, me. In Week 2, my email open rate crashed to 11% because the entire email industry quietly changed its rules. I got it back to 46%. In Week 3, I went through five unglamorous technical tasks: 404 recovery, redirect chain cleanup, tag page noindexing, a crawled-not-indexed audit, and a full broken internal link fix across the site. It’s the kind of work that sits on the to-do list for years until the damage is visible.

This week, the work shifted from infrastructure to content.

Which sounds more fun. It isn’t, necessarily. Because content decisions are harder than technical decisions. A 404 error either has a backlink or it doesn’t. A redirect chain either resolves in one hop or it doesn’t. Content is where the judgment calls live and judgment calls at scale are exhausting.

Here’s everything I did this week: a deep categorisation of 1,219 pages Google crawled but rejected, a triage decision on every post in the crawled-not-indexed report, a rescue operation on the best-performing posts from the 1,496 I noindexed in Week 1, a write-for-us content cluster built from scratch, and the start of the actual content refresh queue. Plus a full guide on how to do all of it yourself.

First, the context: what the data showed

At the end of Week 3, I had the crawled-not-indexed export from Google Search Console: 1,219 URLs that Google had visited and decided weren’t worth indexing. The task I’d set myself was to categorise all of them properly and figure out what to do with each type.

The breakdown, once I’d gone through everything:

  • 419 tag pages – handled by the code from Week 3 (noindex automatically applied to any tag with fewer than 5 posts)
  • 315 Tips posts – handled by the must-use plugin deployed in Week 1 (automatically noindexes everything in the Tips category)
  • 127 core content posts – actual blog posts I wrote. Google looked at them and said no thanks. These became the content refresh priority list.
  • 104 junk or old pages – outdated, irrelevant, or duplicate pages with no redemption value. Noindex applied.
  • 19 pagination pages – /page/2/, /page/3/, etc. Handled by the plugin (noindexed automatically).
  • 16 category pages – thin category archives. Reviewed individually and either noindexed or left depending on post count.

The 127 core content posts are the most important number here. These are the posts that matter: posts Google crawled and actively rejected. That’s a content quality signal, not a technical one. And it’s where most of my time this week went.

Worth noting: The original export had some categorisation errors that needed correcting by cross-referencing against a WordPress content export. If you’re doing this audit yourself, always verify against your actual site data before making decisions. The URLs alone don’t tell you enough.

Task 1 – The Crawled, Not Indexed Triage

Why it matters

When Google crawls a page and doesn’t index it, it’s making a quality judgement. It’s not a technical problem: there’s no broken code or missing sitemap entry. Google simply looked at the page and decided it wasn’t worth showing in search results.

The reasons this happens fall into a few predictable categories: the content is too thin; it’s too similar to something else that already exists (on your site or elsewhere); it doesn’t match any meaningful search intent; it hasn’t been updated and the information is clearly stale; or the on-page quality signals (structure, depth, authoritativeness) aren’t strong enough.

Running this audit tells you exactly which of your posts need attention and prioritises the work. Without it, you’re guessing. With it, you have a list.

How to do it

1. Export the full list from GSC. Go to Google Search Console → Indexing → Pages → Crawled, currently not indexed. Click Export in the top right. Download as a spreadsheet.

2. Sort alphabetically by URL. This makes it immediately obvious which URLs follow the same pattern: all tag pages will cluster together, all paginated URLs will cluster together, and so on.

3. Categorise by URL pattern. Add a column called ‘Type’ and batch-categorise:

  • URL contains /tag/ → Tag page
  • URL contains /category/ → Category page
  • URL contains /page/ → Pagination
  • URL matches your guest post category → Guest post (already handled if you’ve done Week 1 work)
  • Everything else → Core content (needs individual review)

4. Cross-reference against your WordPress export. Export all published posts from WordPress (Tools → Export → All content). Open both spreadsheets. Match URLs from the GSC list against your WordPress post list. This lets you see the post title, category, and publish date alongside the URL, which is essential for making triage decisions. Without this step, you’re looking at a column of URLs with no context.

5. Build a summary count by type. A simple pivot table or COUNTIF formula gives you the breakdown instantly. This tells you where the volume is and where to focus first.

6. Create a separate tab for core content only. Pull out every URL in the ‘Core content’ category into its own tab. Add columns for: post title, category, publish date, word count (if you can get it), and decision (Update / Merge / Noindex / Leave).
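If you’d rather script steps 3 and 5 than do them by hand in a spreadsheet, here’s a minimal Python sketch of the same idea. It’s an illustration, not the exact process used on this site: the `GUEST_POST_PREFIX` value and the example.com URLs are hypothetical placeholders, and you’d feed in the URL column from your own GSC export.

```python
from collections import Counter
from urllib.parse import urlparse

# Hypothetical path prefix for the guest-post category; change it for your site.
GUEST_POST_PREFIX = "/tips/"


def categorise(url: str) -> str:
    """Assign a 'Type' label to a URL based on its path pattern."""
    path = urlparse(url).path
    if not path.endswith("/"):
        path += "/"  # normalise so '/page/2' and '/page/2/' match the same rule
    if "/tag/" in path:
        return "Tag page"
    if "/category/" in path:
        return "Category page"
    if "/page/" in path:
        return "Pagination"
    if path.startswith(GUEST_POST_PREFIX):
        return "Guest post"
    return "Core content"  # everything else needs individual review


def summarise(urls):
    """Count URLs per type, the scripted equivalent of a COUNTIF or pivot table."""
    return Counter(categorise(u) for u in urls)


# Made-up sample URLs standing in for the GSC export.
sample = [
    "https://example.com/tag/seo/",
    "https://example.com/category/tools/",
    "https://example.com/blog/page/2/",
    "https://example.com/tips/some-guest-post/",
    "https://example.com/how-to-write-a-linkedin-bio/",
]
for label, count in summarise(sample).most_common():
    print(f"{label}: {count}")
```

The rules are checked in the same order as the list above, so a paginated category archive is counted as a category page rather than as pagination. Swap the order if you’d prefer the opposite.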

The triage decision framework

For each post in the core content list, the decision framework is:

  • Update – the topic is still relevant, the post has a chance of ranking, it just needs improving. Goes on the content refresh list.
  • Merge – you have two or three thin posts covering the same topic. Combine them into one stronger post and redirect the others to it.
  • Noindex – the topic is too outdated, too niche, or irrelevant to the current site direction. Apply noindex, leave the page live, remove from sitemap.
  • Leave – rare. Sometimes a post isn’t indexed yet simply because it’s very new or Google hasn’t crawled it properly. Check the publish date before deciding this.

My 127 core posts split into 40 as Priority 1 for content refresh (published 2022–2023, most likely to benefit from updating), 29 as Priority 2 (2020–2021, older but topics still valid), 48 as Priority 3 (pre-2020, review needed before deciding), and about 10 that were noindexed immediately because they were genuinely beyond saving.
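The date-based part of that split is mechanical, so it can be scripted once you have publish dates from the WordPress export. A rough sketch, assuming the year bands above and treating anything published from 2022 onwards as Priority 1 (that open-ended upper bound is my assumption). The noindex decision stays manual, because it depends on reading the post.

```python
from collections import Counter
from datetime import date


def refresh_priority(published: date) -> str:
    """Map a publish date to a refresh priority band."""
    if published.year >= 2022:
        return "Priority 1"  # most recent, likely closest to the quality threshold
    if published.year >= 2020:
        return "Priority 2"  # older, topics still valid
    return "Priority 3"      # pre-2020, review before deciding


# Made-up publish dates for illustration.
publish_dates = [date(2023, 3, 14), date(2022, 11, 2), date(2021, 6, 30), date(2018, 1, 9)]
bands = Counter(refresh_priority(d) for d in publish_dates)
for band in sorted(bands):
    print(f"{band}: {bands[band]}")
```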

Worth noting: When in doubt, update rather than noindex. A noindexed post generates no traffic and contributes nothing. An updated post might. The bar for noindexing a post you wrote should be high. Reserve it for content that’s genuinely unsalvageable.

Task 2 – The Tips Post Rescue Operation

Why it matters

In Week 1, I noindexed 1,496 posts that were diluting the site’s topical authority and signalling to Google that the site was unfocused. Noindexing them en-masse was the right call.

But the problem with making decisions at scale is you occasionally noindex something that didn’t deserve it.

Before deploying the plugin, I needed to check whether any of those 1,496 posts were performing: getting clicks, impressions, or external backlinks. If they were, quietly noindexing them would mean giving up real traffic and real authority for no reason.

How to do it

1. Export GSC performance data for the category. In Google Search Console → Search Results, filter by Page and use a URL filter to show only URLs containing your post category path. Set the date range to the last 12 months. Export.

2. Filter for any post with clicks or impressions. Sort by clicks descending. Anything with actual clicks is worth reviewing individually.

3. Cross-reference with your backlink data. In GSC → Links → Top linked pages, look for any category URLs. In Ahrefs or SEMrush (if you have access), run a backlink check on your domain and filter for the guest post category URL path.

4. Make a rescue list. Any post with meaningful clicks or external backlinks goes on the rescue list. These are posts that were performing, even slightly, and shouldn’t have been caught in the mass noindex.

5. Move rescued posts to appropriate categories. For each rescued post, find the right category on your site (Content Marketing, SEO, Business, Tools, whatever fits) and change the category in WordPress. Once it’s out of the category, the plugin no longer applies and it becomes indexable again.

6. Submit for reindexing in GSC. Go to the URL Inspection tool in Search Console, paste each rescued URL, and click Request Indexing. This signals to Google that the page has changed and is ready for re-evaluation.
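Steps 2 to 4 boil down to one set operation: any URL with clicks, plus any URL with an external backlink. Here’s a hedged sketch of that logic. The dictionary keys (`url`, `clicks`), the `min_clicks` threshold, and the sample paths are all assumptions for illustration. Your GSC and backlink exports will use their own column names, so map them in before using anything like this.

```python
def build_rescue_list(performance_rows, backlinked_urls, min_clicks=1):
    """Return candidate URLs with clicks or external backlinks, most clicks first."""
    clicks_by_url = {row["url"]: int(row["clicks"]) for row in performance_rows}
    candidates = {url for url, clicks in clicks_by_url.items() if clicks >= min_clicks}
    candidates |= set(backlinked_urls)  # a backlink alone is enough to earn a review
    return sorted(candidates, key=lambda url: (-clicks_by_url.get(url, 0), url))


# Made-up rows standing in for the performance export and the backlink report.
performance = [
    {"url": "/tips/a/", "clicks": "12"},
    {"url": "/tips/b/", "clicks": "0"},
    {"url": "/tips/c/", "clicks": "3"},
]
backlinked = {"/tips/b/", "/tips/d/"}

for url in build_rescue_list(performance, backlinked):
    print(url)
```

The output is a review list, not a final decision. Each candidate still needs a human look at whether the content is good enough to stand on its own.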

What this found

Of the 1,496 noindexed posts, approximately 37 had either clicks or backlinks worth paying attention to. Of those, 21 were worth rescuing: posts where the content was decent enough to stand on its own once moved to a proper category.

The remaining 16 had a backlink or two but the content wasn’t good enough to justify the rescue effort. Left them noindexed.

Five of the 21 rescued posts were specifically about guest blogging: posts about how to write guest posts, how to find guest post opportunities, and so on. These were moved into the Content Marketing category, where they now sit alongside and link to the write-for-us cluster I built this week (more on that below). Relevant content finding its relevant home.

Worth noting: This step takes longer than it sounds. Categorising 37 posts individually, researching the right home for each, moving them, and submitting for reindexing is several hours of focused work. But it’s worth doing before deploying any mass noindex; you want to know what you’re giving up before you give it up.

Task 3 – Building the Write-For-Us Content Cluster

Why it matters

One of the ways www.lilachbullock.com generates revenue is through sponsored posts and link placements: brands and agencies paying to publish content on the site. Before you can earn from that, people have to be able to find the site when they’re looking.

How do people look? They Google things like ‘write for us digital marketing’ or ‘submit guest post SEO blog’ or ‘write for us small business.’ They’re searching for sites that accept guest content.

My main /write-for-me/ page wasn’t capturing any of this traffic because it was a single page with no supporting content around it. No topical cluster, no internal linking, no signal to Google that this page was authoritative on the topic of guest posting.

A content cluster fixes that. You build a hub page (the main topic) and a set of spoke pages (specific subtopics), all interlinked. The hub passes authority to the spokes, the spokes reinforce the hub, and the whole cluster signals to Google that this site is genuinely covering the topic in depth.

How to do it

  1. Identify your hub page. This already exists: it’s whatever page you use to receive guest post or sponsored content submissions. Mine is /write-for-me/. This becomes the centre of the cluster.
  2. Research the search queries around the topic. Use Google Search or a keyword tool to find variations people search for. For a write-for-us cluster, the pattern is usually ‘[write for us / submit a post / guest post] + [niche or topic]’. Look for write for us digital marketing, write for us SEO, write for us AI, write for us small business, write for us email marketing, write for us productivity, and so on. These are your spoke topics.
  3. Write a spoke page for each topic. Each spoke page should explain what kind of content you accept in that niche, who the audience is, what makes a strong submission, and how to apply. It’s not just a landing page; it’s a useful piece of content for someone considering a guest post. Word count doesn’t need to be enormous, but 600–900 words of real substance is the minimum.
  4. Interlink everything deliberately. Every spoke page links to the hub page (/write-for-me/ or equivalent). The hub page links to every spoke. Each spoke links to at least two other spokes where there’s a natural connection. This creates the web of internal links that tells Google the cluster is coherent.
  5. Submit all new pages for indexing in GSC. As soon as they’re published, paste each URL into the GSC inspection tool and request indexing. New pages can take weeks to be crawled organically. A manual request shortens that.
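The interlinking rules in step 4 are easy to get wrong once a cluster grows past a handful of pages, so a quick check helps. A minimal sketch, assuming you’ve already collected each page’s outgoing internal links into a dictionary of sets (for example from a crawl export). The spoke paths here are hypothetical, and only /write-for-me/ comes from this post.

```python
def check_cluster(hub, links):
    """Check hub-and-spoke linking rules.

    `links` maps each page in the cluster to the set of internal URLs it links to.
    Returns a list of problems; an empty list means every rule is met.
    """
    spokes = sorted(set(links) - {hub})
    problems = []
    for spoke in spokes:
        if spoke not in links.get(hub, set()):
            problems.append(f"hub does not link to {spoke}")
        outgoing = links[spoke]
        if hub not in outgoing:
            problems.append(f"{spoke} does not link to hub")
        other_spokes = (outgoing & set(spokes)) - {spoke}
        if len(other_spokes) < 2:
            problems.append(f"{spoke} links to only {len(other_spokes)} other spoke(s)")
    return problems


hub = "/write-for-me/"
cluster = {
    hub: {"/write-for-us-ai/", "/write-for-us-seo/", "/write-for-us-email/"},
    "/write-for-us-ai/": {hub, "/write-for-us-seo/", "/write-for-us-email/"},
    "/write-for-us-seo/": {hub, "/write-for-us-ai/", "/write-for-us-email/"},
    "/write-for-us-email/": {hub, "/write-for-us-ai/"},  # one spoke link short
}
for problem in check_cluster(hub, cluster):
    print(problem)
```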

What I built

Seven new posts in the write-for-us cluster: Technology, AI, and Small Business (which I wrote), plus Email Marketing, Social Media, and Productivity. All published, all interlinked with each other and with the main /write-for-me/ page, all submitted for indexing.

The commercial logic is direct. Someone at a SaaS company Googles ‘write for us AI tools’ looking for sites that cover AI content. This cluster is now competing for that search. If they find the page, they read about the audience and the submission process, and they either submit a sponsored post or they enquire. That enquiry is revenue.

It’s not passive income yet. But it’s passive pipeline: inbound enquiries from people who’ve already self-selected as interested. That’s the best kind.

Worth noting: If your site has a services page, a hire-me page, a courses page, or any other commercial page that isn’t surrounded by supporting content, this same cluster approach applies. The hub page doesn’t rank well in isolation. It ranks well when Google can see it’s the authoritative centre of a properly developed topic.

Task 4 – Broken Internal Links and Orphan Pages

Why it matters

A broken internal link is a link on your site that points to a URL that no longer exists or returns an error. Every broken internal link is a small signal to Google that the site isn’t well-maintained. Across a site with 2,000+ posts and a decade of content accumulation, there were a lot of them.

Orphan pages are a related problem: pages that exist but have no internal links pointing to them. If no other page on your site links to a post, Google finds it by crawling your sitemap rather than by following links. That means weaker authority, weaker context, and often a sign that the content isn’t properly integrated into the site.

How to do it – broken internal links

1. Crawl the site with Screaming Frog. The free version handles up to 500 URLs. If your site is larger, you’ll need the paid version ($249/year, worth it for sites of any real size). Configure it to crawl internal links and report on response codes.

2. Export the broken link report. In Screaming Frog: Reports → Bulk Export → All Inlinks. Filter for status codes 404 (page not found) and 410 (gone). Also flag any 301 redirects. These aren’t broken, but they should be updated to point directly to the final destination rather than going through a redirect.

3. Build a fix spreadsheet. Columns: Source page (the page containing the broken link), broken URL, status code, fix type (Update or Remove), new URL. For each broken link: search the site for the closest matching live page using Google site:yourdomain.com [topic keyword]. If found: update. If not found: remove the link but leave the surrounding text intact. Don’t delete content, only the hyperlink.

4. Fix the navigation first. Start with any broken links in your site navigation, header, or footer. These appear on every page, so one fix resolves the broken link on thousands of pages simultaneously. On my site, the Events page link in the main navigation (page_id=30) was broken. Fixing that one link resolved 484 broken internal links in one go.

5. Work through the remaining individual links. These are links within post content. They’re harder to batch and have to be done one by one. Prioritise by the authority of the source page (fix broken links on your most-linked pages first).

6. Verify by re-crawling. After fixes, re-crawl a sample of pages where changes were made and confirm the broken links are gone.
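One way to spot a navigation-level problem like that quickly is to count broken links per destination URL: a single destination with hundreds of hits is almost certainly a site-wide link. A small sketch of that grouping, assuming you’ve loaded the inlinks export into (source, destination, status code) tuples. The tuple layout and the example.com URLs are my assumptions, not the export’s actual format.

```python
from collections import Counter

BROKEN_STATUSES = {404, 410}


def broken_link_hotspots(inlinks):
    """Count broken internal links per destination URL, most frequent first.

    `inlinks` is an iterable of (source_url, destination_url, status_code) tuples.
    """
    counts = Counter(
        dest for _source, dest, status in inlinks if status in BROKEN_STATUSES
    )
    return counts.most_common()


# Made-up crawl rows for illustration.
rows = [
    ("https://example.com/post-1/", "https://example.com/events/", 404),
    ("https://example.com/post-2/", "https://example.com/events/", 404),
    ("https://example.com/post-3/", "https://example.com/events/", 404),
    ("https://example.com/post-1/", "https://example.com/old-guide/", 410),
    ("https://example.com/post-2/", "https://example.com/about/", 200),
    ("https://example.com/post-3/", "https://example.com/moved/", 301),
]
for destination, count in broken_link_hotspots(rows):
    print(f"{count:>4}  {destination}")
```

Destinations at the top of that list are where one edit fixes the most links. The long tail of single-hit destinations is the one-by-one work.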

What this found on my site

751 broken internal links in total. The navigation fix (one broken link → 484 fixed) was the single most efficient action of the entire week. That left approximately 267 individual broken links within post content once the navigation was resolved.

About 120 of those individual links have been fixed. Around 140 remain and are being worked through systematically.

One note on what I skipped: there were 826 redirect links flagged in the audit, meaning internal links that go to a URL that then redirects to another URL. Technically these should be updated to point directly to the final destination. But the time required to fix 826 links individually wasn’t justified by the gain. Redirect links don’t break the user experience and the authority loss is marginal, so I skipped this part.

Worth noting: Not every technical SEO recommendation needs to be acted on. The question is always: what’s the return on the time invested? 826 redirect links, each requiring individual manual editing, would take many hours for a modest SEO gain. 484 broken links resolved with a single navigation fix took ten minutes. Do the high-leverage work first.

How to do it – orphan pages

1. Export all internal links from Screaming Frog. Go to Bulk Export → All Inlinks. This gives you every internal link on the site: source URL, destination URL, anchor text.

2. Create a destination URL frequency count. In your spreadsheet, do a COUNTIF on the destination URL column to see how many times each page is linked to from elsewhere on the site. Pages with a count of 0 are true orphans. Pages with a count of 1 are near-orphans.

3. Cross-reference with your content list. Match this data against your WordPress post export so you can see the post title alongside the link count. A post titled ‘Complete Guide to Email Automation’ with only one internal link pointing to it is a problem.

4. Save the single-link list as a reference document. You don’t need to fix all of these immediately. But every time you refresh an old post or write a new one, check this list. If the topic is relevant, add an internal link to one of the under-linked posts. You build the linking naturally over time rather than trying to fix 247 posts all at once.
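The same count can be scripted. One thing to watch: a page with zero inbound links never appears as a destination in the inlinks export, so you need the full page list (from the WordPress export) to find true orphans. A minimal sketch under those assumptions. Ignoring self-links and counting repeated links from the same source only once are my choices here, not something the spreadsheet COUNTIF does.

```python
from collections import Counter


def inbound_link_report(all_pages, inlinks):
    """Split pages into orphans (0 inbound links) and near-orphans (exactly 1).

    `all_pages` is the full list of page URLs on the site.
    `inlinks` is an iterable of (source_url, destination_url) pairs.
    """
    unique_pairs = {(src, dest) for src, dest in inlinks if src != dest}
    counts = Counter(dest for _src, dest in unique_pairs)
    orphans = sorted(p for p in all_pages if counts[p] == 0)
    near_orphans = sorted(p for p in all_pages if counts[p] == 1)
    return orphans, near_orphans


# Made-up pages and links for illustration.
pages = ["/a/", "/b/", "/c/", "/d/"]
link_pairs = [("/a/", "/b/"), ("/a/", "/b/"), ("/c/", "/b/"), ("/a/", "/c/"), ("/d/", "/d/")]

orphans, near_orphans = inbound_link_report(pages, link_pairs)
print("Orphans:", orphans)
print("Near-orphans:", near_orphans)
```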

What this found

No true orphan pages: every page had at least one internal link pointing to it. But 247 pages had only a single internal link, which is effectively the same problem. These pages are poorly integrated into the site’s internal linking structure and Google is likely not treating them with the authority they deserve.

I’ve saved this list. Every content refresh going forward uses it as a reference: if the topic of a post I’m refreshing is relevant to one of the 247 under-linked posts, I add a link. It costs almost no time and it steadily improves the site’s link architecture.

Task 5 – The Content Refresh Queue (How I’m Updating Posts)

Why it matters

The 127 core content posts Google rejected aren’t there by accident. Google looked at them and made a quality decision. The content refresh is about understanding what that decision was based on and fixing it.

This is not copy-editing. It’s not changing the date on a post and resubmitting it. And it’s not a full rewrite for the sake of word count. It’s surgical: find what Google didn’t like, fix that specifically, leave what works.

How to work out why a post was rejected

Before changing a word, the first question is: why did this post fail to index? The answer is almost always one of five things.

  • Too thin. The post exists but doesn’t cover the topic in enough depth to be useful. A 400-word post on a topic that warrants 1,500 words tells Google the site isn’t really covering that topic seriously.
  • Outdated information. Statistics from three years ago that are now wrong. Tools or platforms that no longer exist. Advice that’s been superseded by platform changes. Google’s helpful content guidance is explicit: it evaluates whether content reflects current best practice.
  • Wrong search intent match. You wrote about a topic in a way that doesn’t match how people search for it. Someone searching for ‘how to write a LinkedIn bio’ wants step-by-step instructions, not a philosophical piece about personal branding. If the format doesn’t match the intent, Google won’t rank it.
  • Cannibalisation. You have multiple posts competing for the same keyword. Google picks one to index and ignores the others. The solution is to merge the weaker posts into the strongest one and redirect the URLs.
  • Quality signals too weak. The post exists and is technically fine but the structure is poor: no headings, no clear answer to the question, thin introduction, no conclusion. Google’s quality evaluation isn’t just about word count. It’s whether the content looks like it was written to help someone.

Step 1: Check what’s currently ranking

For each post on the refresh list, search for the primary keyword in Google before touching anything. Look at the top five results. What format are they using? Are they step-by-step guides, comparison articles, definition posts, listicles? How long are they? What questions are they answering that my post isn’t?

This isn’t to copy anyone. It’s to benchmark. If every top result for ‘email list segmentation’ is a detailed tutorial with code examples and every segment explained, and my post is three paragraphs of general advice, I know why mine isn’t in the index.

Step 2: Diagnose the specific problem

With the benchmark in mind, go back to the post and identify specifically what needs fixing. Don’t try to fix everything. Fix what Google rejected it for. Add depth where it’s thin. Update what’s outdated. Restructure if the intent match is off. Merge if it’s cannibalising.

Step 3: Update the content

The actual editing. Depending on the diagnosis, this might mean:

  • Adding a proper introduction that clearly states what the post covers and who it’s for
  • Expanding thin sections, wherever you’re hand-waving over something the reader actually needs to understand
  • Replacing outdated statistics, tools, or references with current ones
  • Restructuring with clear headings that match how people search for the topic
  • Adding a conclusion that tells the reader what to do with the information
  • Adding examples, case studies, or specific how-to instructions where the post was previously vague

The target isn’t a specific word count. The target is: does this post now answer the question better than what Google is currently ranking? If yes, it’s ready.

Step 4: Fix the internal linking on every post you touch

Every single post that goes through a content refresh gets its internal linking checked at the same time. Two questions:

  • Does this post link to relevant content elsewhere on the site? If not, add links where there’s a natural connection.
  • Are there other posts on the site that should link to this one but don’t? Cross-reference against the 247 single-link pages list. If this post is on that list, find at least two other posts to link from.

Internal linking is one of the highest-leverage things you can do while you’re already in a post. You’re already there, you already have it open. Adding or fixing an internal link takes two minutes and strengthens the entire site’s structure. Not doing it while you’re there is leaving easy value on the table.

Step 5: Update the SEO metadata

Title tag and meta description. Before saving, check:

  • Does the title tag include the primary keyword naturally?
  • Does the meta description accurately describe what the post delivers, not what you wish it delivered?
  • Would you click on this if you saw it in a search result?

I use Yoast for this. The Yoast snippet editor shows exactly how the title and description will appear in Google. If the title is being truncated, shorten it. If the meta description reads like keyword soup, rewrite it in human. The meta description isn’t a direct ranking factor, but a higher click-through rate is, and click-through rate starts with the snippet.

Step 6: Request reindexing

After every updated post, go to Google Search Console → URL Inspection, paste the URL, and click Request Indexing. This doesn’t guarantee immediate indexing but it does put the page back on Google’s crawl queue faster than waiting for the regular cycle. For a post that’s been languishing in the crawled-not-indexed report, this is the signal that something has changed and is worth re-evaluating.

The prioritisation logic

I’m not working through the 127 posts randomly. The priority order is:

  • Priority 1 (40 posts, 2022–2023): Most recent, most likely to be close to Google’s quality threshold already, most likely to respond quickly to updating.
  • Priority 2 (29 posts, 2020–2021): Older and will need more significant updates, but the topics are still relevant.
  • Priority 3 (48 posts, pre-2020): Require a judgment call on each one before updating. Some will be updated, some merged, some noindexed.

Starting with the newest content means more posts refreshed before the older ones require the heavier lifting. It’s also where the quickest wins tend to be.

Worth noting: Content refreshing has a compounding effect. A post that gets reindexed after a refresh might start getting impressions. Impressions turn into clicks. Clicks build engagement signals. Engagement signals tell Google to give the page more visibility. The first wave of refreshes doesn’t produce the result; the accumulated effect of 40, 80, 120 refreshed posts does.

What’s moving – the honest numbers

I want to give you the real version, not the cleaned-up-for-the-blog version.

The SEO work is progressing. The results aren’t visible yet. This is expected: Google doesn’t respond to technical cleanup immediately. The guidance is 6–8 weeks from when the changes are in place to seeing meaningful movement in impressions and crawl behaviour. The noindex work was deployed on February 12th. I’m in week 4. I’m watching, not expecting.

My baseline metrics, for anyone tracking along:

  • Organic search sessions: approximately 2,753 over 54 days (12% of traffic, up from 9.6%)
  • GSC clicks: 4,020 / impressions: 598,000 over three months
  • Average position: 27.3
  • Indexed pages vs not-indexed: 1,810 indexed vs 4,580 not indexed. This ratio should shift significantly as Google works through the noindex queue.

The biggest thing to watch is the indexed/not-indexed ratio. Right now, more than twice as many pages are not indexed as indexed. As Google processes the noindex signals from the plugin, that ratio should flip: fewer not-indexed pages, better-focused indexed pages, clearer topical signal. That’s the outcome everything so far is pointing toward.
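For anyone tracking their own baseline, the ratios behind those figures are simple arithmetic. A quick sketch using the numbers quoted above; only the variable names are mine.

```python
# Baseline figures quoted in this post.
indexed = 1_810
not_indexed = 4_580
clicks = 4_020
impressions = 598_000

ratio = not_indexed / indexed                       # not-indexed pages per indexed page
indexed_share = indexed / (indexed + not_indexed)   # share of known pages in the index
ctr = clicks / impressions                          # overall click-through rate

print(f"Not-indexed pages per indexed page: {ratio:.2f}")
print(f"Share of known pages indexed: {indexed_share:.1%}")
print(f"Click-through rate: {ctr:.2%}")
```

That works out to roughly 2.53 not-indexed pages for every indexed one, about 28.3% of known pages indexed, and a click-through rate of about 0.67%. Re-running the same three lines each week gives a consistent way to see whether the ratio is moving.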

I’ll share updated numbers in Week 5. If something moves dramatically before then, I’ll say so.

If you want to run this process on your own site

You don’t need a site with thousands of posts for any of this to be worth doing. Even 30 or 40 posts sitting in the crawled-not-indexed report is enough to justify the audit and the triage process.

The sequence that makes sense:

1. Run the crawled-not-indexed audit first. GSC → Indexing → Pages → Crawled, currently not indexed. Export it. This is your diagnostic data.

2. Categorise by URL pattern. Sort, batch-categorise, identify your core content list.

3. Triage before touching anything. Go through the core content list and make a decision on each post: Update, Merge, Noindex, or Leave. Don’t start editing until you’ve got the full picture.

4. Fix the navigation and site-wide links. Check for broken links in your header, footer, and navigation. One fix here can resolve hundreds of broken links across the site.

5. Start the content refresh queue. Newest content first. Benchmark against what’s ranking. Diagnose the specific problem. Fix that. Submit for reindexing.

6. Do the internal linking as you go. Every post you open, check the linking in and out. It takes almost no additional time and the cumulative effect is significant.

The work isn’t glamorous. None of this is. But it’s the kind of thing that compounds quietly in the background and produces results that look, from the outside, like a sudden improvement. There’s no sudden improvement. There’s just a lot of careful, systematic work that eventually tips the scales.

What’s next

Week 5 is going to be the first proper data check-in: enough time should have passed for the noindex work to start showing up in GSC’s indexing data. I want to see whether the indexed/not-indexed ratio has shifted, whether impressions have started moving, and whether any of the rescued posts or write-for-us cluster pages have been picked up.

I’m also going to look at the Looker Studio dashboard I’ve set up to track SEO, email, and social in one place: a proper monitoring system rather than logging into three separate tools every time.

Same as every week: real numbers, what I’m doing and why, no tidying it up.

If you’re doing any of this on your own site, or you’ve hit a specific issue I haven’t covered, comment below. I read all of them.

Catch up on the full series:

  • Week 1: I’m De-Indexing 1,300 Pages to Save My Website (And Why This Might Be Your Only Option Too)
  • Week 2: My Open Rate Crashed to 11% and I Didn’t Even Do Anything Wrong
  • Week 3: The Unglamorous SEO Work Nobody Talks About (But Everyone Needs to Do)

About Lilach Bullock

Hi, I’m Lilach, a serial entrepreneur! I’ve spent the last 2 decades starting, building, running, and selling businesses in a range of niches. I’ve also used all that knowledge to help hundreds of business owners level up and scale their businesses beyond their beliefs and expectations.

I’ve written content for authority publications like Forbes, Huffington Post, Inc, Twitter, Social Media Examiner and hundreds of other publications and, my proudest achievement, won a Global Women Champions Award for outstanding contributions and leadership in business.

My biggest passion is sharing knowledge and actionable information with other business owners. I created this website to share my favorite tools, resources, events, tips, and tricks with entrepreneurs, solopreneurs, small business owners, and startups. Digital marketing knowledge should be accessible to all, so browse through and feel free to get in touch if you can’t find what you’re looking for!

