Follow Lilach
I’m De-Indexing 1,300 Pages to Save My Website
This blog post is the first of my new weekly series where I’m rebuilding my business in public. For the last 10 years, I’ve relied on passive income from my website. But over the last 6-9 months, that income has dwindled to almost nothing. So, I’m rebuilding and documenting everything as I go, as it’s actually happening. Every week I’ll share what I’m doing, what’s working, and what isn’t. Sometimes it will be more technical SEO things like today’s blog post. Other times it might be how I’m growing my email list or testing what works on social media.
Today’s topic gets a bit technical, but stick with me. Even if you’re not into SEO, the same principle applies: sometimes you have to deliberately shrink something to make it grow again. And the commercial reason I’m doing this affects anyone trying to monetize a website or get sponsorships.
This week I’m deliberately removing over 1,300 pages from Google’s index.
It sounds insane. It probably is. But let me explain why I believe this is the right move.
When Your Website Becomes a Graveyard
I haven’t properly worked on my website in years. Life happened, priorities shifted, and the site just… existed.
During that time, organic traffic slowly bled out. Not because of a penalty or an algorithm update. Just gradual decline from neglect.
But here’s what cost me the income: the site stopped passing the automated checks that agencies and sponsors use to decide if you’re worth their time.
This site used to generate a passive income of $5,000 to $7,000 per month from sponsored posts and link placements. That revenue has now dried up completely. Not because my traffic disappeared, but because tools like Ahrefs and SEMrush showed my organic traffic as much lower than it actually was. When a sponsor checks your domain and sees weak estimated traffic, the conversation ends before it starts. It doesn’t matter what your Google Analytics shows. Nobody asks for screenshots. They trust the tools.
So I had two problems: declining traffic, and tools that made the traffic look even worse than it was.
What I Found When I Actually Looked at the Data
I finally opened Google Search Console for the first time in months. The numbers were brutal.
Out of over 6,000 pages that Google knew about:
- 1,810 were indexed
- 4,580 were not indexed
That’s a massive red flag. When you have more than twice as many unindexed pages as indexed ones, Google is telling you something: it can’t figure out what your site is actually about.
The reasons pages weren’t being indexed told the whole story:
- 1,110 pages with redirect issues
- 513 duplicate canonical problems
- 1,219 pages crawled but not indexed
- 1,154 discovered but not indexed
This is what indexation bloat looks like. The site had accumulated over 4,000 articles over a decade.
About 1,300 of those articles covered topics that no longer align with where I’m taking the site. They weren’t bad content necessarily, but they were about social media tactics from 2014, tools that don’t exist anymore, strategies that are outdated. They made sense at the time. They just don’t fit the direction anymore.
But they were all still there, indexed, making it harder for Google to figure out what my site is about now.
Google Search Console showing the indexation problems
Why This Kills SEO Performance (The Technical Explanation)
When Google crawls your site, it’s trying to understand what you’re about so it knows when to show your pages in search results.
This assessment happens at multiple levels:
1. Topic clustering
Google looks at the topics you cover and tries to determine your areas of expertise. If you have 200 articles about social media marketing from 2014 mixed with 50 articles about AI tools from 2024, your topical authority gets diluted. You’re not an expert in anything specific. You’re just… scattered.
2. Content quality signals
When a significant portion of your indexed content is thin, outdated, or off-topic, it affects how Google perceives your entire site. The algorithm doesn’t evaluate each page in isolation. It’s looking at patterns across your whole domain.
3. Crawl budget allocation
Google doesn’t crawl every page on your site every day. It allocates crawl budget based on perceived value. When you have thousands of low-value pages, Google wastes time crawling content that doesn’t matter instead of focusing on your best material.
4. Link equity distribution
Every internal link on your site passes a small amount of authority. When you have 4,000 pages all linking to each other through navigation, sidebar widgets, and related post modules, that authority gets spread incredibly thin. Your best content isn’t getting the internal linking power it deserves.
And here’s the commercial problem: third-party SEO tools like Ahrefs, SEMrush, and Moz sample your indexed pages to estimate your traffic. When they crawl your site and find a bunch of old, low-traffic pages mixed in with your active content, their algorithms assume your overall traffic is lower than it actually is.
This creates a death spiral for monetization. Lower estimated traffic means fewer sponsorship opportunities. Fewer opportunities means less revenue. Less revenue means less investment in content. Less investment means more decline.
What I’m Actually Doing About It
So here’s what I’m doing. I’m deliberately removing those 1,300+ outdated articles from Google’s index.
Not deleting them. Not redirecting them. Just de-indexing them.
The pages will still exist. The URLs will still work. But Google will stop including them in its assessment of what my site is about.
This should accomplish three things:
First, it will clarify my topical focus. Once the older content is out of the index, Google will have a clearer picture of what I cover now. My current content is focused on helping people grow their businesses: AI, tools, marketing, sales, productivity, social media, all of it. That’s what I want to be known for.
Second, it will improve crawl efficiency. Google will stop wasting resources on content that doesn’t matter and can focus on crawling and ranking my best material.
Third, it should improve how third-party tools perceive my traffic. When Ahrefs samples my indexed pages, they’ll see higher-quality content that gets traffic, which should improve my estimated organic visits.
That last point matters because my goal isn’t just to recover traffic. It’s to pass the threshold where tools like SEMrush show me above 1,000 organic visits per month. That’s the number where commercial conversations become possible again.
How to Actually Do This (Step-by-Step Implementation)
If you’re facing a similar situation, here’s exactly how to implement this strategy. I’m using WordPress with the Yoast SEO plugin, but the principles apply to any platform.
Step 1: Create Your De-Indexation Rule
Before you touch anything, you need a clear rule for what gets de-indexed. This removes the paralysis of deciding page by page.
For me, the rule was simple: anything in a specific category that held older content from a different era of the site gets de-indexed. Everything else stays.
Your rule might be different:
- Any post published before a certain date
- Any post with less than X pageviews in the last 12 months
- Any post tagged as a specific content type
The key is making it systematic. You don’t want to be making quality judgments on 1,000+ posts individually.
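As a sketch, a systematic rule like this can be written as a single predicate and applied to the whole inventory in one pass. Everything here is a made-up example (the category name, cutoff date, and pageview threshold), and the three candidate rules from the list above are combined with “or” purely for illustration:

```python
from datetime import date

# Hypothetical rule: de-index anything in a legacy category, published before
# 2018, or with under 100 pageviews in the last 12 months. Adjust to your rule.
LEGACY_CATEGORIES = {"social-media-tactics"}
CUTOFF_DATE = date(2018, 1, 1)
MIN_PAGEVIEWS = 100

def should_deindex(post: dict) -> bool:
    """Return True if a post matches the de-indexation rule."""
    return (
        post["category"] in LEGACY_CATEGORIES
        or post["published"] < CUTOFF_DATE
        or post["pageviews_12mo"] < MIN_PAGEVIEWS
    )

# Invented sample posts standing in for the real content inventory.
posts = [
    {"url": "/old-tactics", "category": "social-media-tactics",
     "published": date(2014, 5, 1), "pageviews_12mo": 12},
    {"url": "/ai-tools", "category": "ai",
     "published": date(2024, 3, 1), "pageviews_12mo": 2400},
]
to_deindex = [p["url"] for p in posts if should_deindex(p)]
print(to_deindex)  # ['/old-tactics']
```

Because the rule lives in one function, changing your mind means editing one place and re-running it, not re-reviewing 1,000+ posts by hand.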
Step 2: Export Your Content Inventory
You need to see what you’re working with. Export a complete list of all your posts with these columns:
- Post URL
- Post title
- Category or tag
- Publish date
- Current index status (if you can get it from GSC)
In WordPress, you can do this with a plugin like WP All Export, or if you’re comfortable with WP-CLI, you can export directly from the database.
This spreadsheet becomes your source of truth. It shows you exactly how many pages you’re dealing with and confirms that your rule is catching the right content.
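For example, whichever export route you take, a short script can assemble the spreadsheet with exactly those columns and confirm it round-trips cleanly. The rows below are invented sample data, and the script writes to an in-memory buffer; for a real file, use `open("inventory.csv", "w", newline="")` instead:

```python
import csv
import io

# Columns from the checklist above; rows are invented sample data.
FIELDS = ["url", "title", "category", "publish_date", "index_status"]
rows = [
    {"url": "https://example.com/old-tactics", "title": "Old Tactics",
     "category": "social-media-tactics", "publish_date": "2014-05-01",
     "index_status": "Indexed"},
    {"url": "https://example.com/ai-tools", "title": "AI Tools",
     "category": "ai", "publish_date": "2024-03-01",
     "index_status": "Crawled - currently not indexed"},
]

# Write the inventory (in-memory here; use a real file path in practice).
buf = io.StringIO()
writer = csv.DictWriter(buf, fieldnames=FIELDS)
writer.writeheader()
writer.writerows(rows)

# Read it back to confirm the source-of-truth file round-trips intact.
inventory = list(csv.DictReader(io.StringIO(buf.getvalue())))
print(len(inventory), inventory[0]["category"])
```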
Step 3: Apply Noindex Tags
Now for the actual implementation. You’re going to add a noindex meta tag to every post that matches your rule.
A noindex tag tells search engines: ‘This page exists and you can crawl it, but don’t include it in your search results.’
The tag looks like this in the HTML head section:
<meta name="robots" content="noindex, follow">
The ‘follow’ part means search engines can still follow links on the page, which is important for maintaining some link equity flow.
If you’re using Yoast SEO, here’s how to do this efficiently:
Option A: Individual posts
1. Edit the post
2. Scroll to the Yoast SEO meta box
3. Click the gear icon (advanced settings)
4. Set ‘Allow search engines to show this post in search results’ to No
5. Update the post
Option B: Bulk editing (faster for hundreds of posts)
If you’re dealing with hundreds of posts, individual editing isn’t practical. You have two options:
1. Use a database query to update the Yoast post meta directly (advanced, requires database access)
2. Hire a VA to do it manually with clear documentation
For the database approach, Yoast stores the noindex setting in the postmeta table under the key ‘_yoast_wpseo_meta-robots-noindex’. You can bulk update posts in a specific category with a SQL query, but I strongly recommend testing on a staging site first.
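To make the shape of that query concrete, here is the same logic run against a stripped-down, in-memory stand-in for the WordPress tables, using sqlite3 instead of MySQL. The table names and the Yoast meta key match the real schema, but the `term_taxonomy_id` of 42 and the sample posts are invented, and as noted above, on a live site you should back up the database and test on staging first:

```python
import sqlite3

# Minimal stand-in for the WordPress schema (sqlite3 here; WordPress uses MySQL).
con = sqlite3.connect(":memory:")
cur = con.cursor()
cur.executescript("""
CREATE TABLE wp_posts (ID INTEGER PRIMARY KEY, post_title TEXT);
CREATE TABLE wp_term_relationships (object_id INTEGER, term_taxonomy_id INTEGER);
CREATE TABLE wp_postmeta (meta_id INTEGER PRIMARY KEY AUTOINCREMENT,
                          post_id INTEGER, meta_key TEXT, meta_value TEXT);
""")
cur.executemany("INSERT INTO wp_posts VALUES (?, ?)",
                [(1, "Old social media post"), (2, "Current AI post")])
# Post 1 belongs to the legacy category (term_taxonomy_id 42 is invented).
cur.executemany("INSERT INTO wp_term_relationships VALUES (?, ?)", [(1, 42)])

# The bulk update: add Yoast's noindex flag ('1') for every post in the category.
cur.execute("""
INSERT INTO wp_postmeta (post_id, meta_key, meta_value)
SELECT tr.object_id, '_yoast_wpseo_meta-robots-noindex', '1'
FROM wp_term_relationships tr
WHERE tr.term_taxonomy_id = 42
""")
con.commit()

rows = cur.execute(
    "SELECT post_id FROM wp_postmeta "
    "WHERE meta_key = '_yoast_wpseo_meta-robots-noindex'"
).fetchall()
print(rows)  # only post 1 gets the noindex flag
```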
Step 4: Remove Pages from XML Sitemap
Your XML sitemap tells search engines which pages you think are important enough to index. If you’re de-indexing content, you should remove it from your sitemap too.
This creates a consistent signal: noindex tag says ‘don’t index this’ and the sitemap says ‘this isn’t a priority.’
Good news: if you’re using Yoast, this happens automatically. When you set a post to noindex, Yoast removes it from your XML sitemap.
To verify this is working:
1. Go to yoursite.com/sitemap_index.xml
2. Click through to your post sitemap
3. Check that the noindexed posts aren’t listed
If you’re using a different SEO plugin or custom sitemap generation, you’ll need to manually exclude the de-indexed content.
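To spot-check this at scale rather than eyeballing it, you can parse the post sitemap and verify that none of your de-indexed URLs are still listed. Here is a minimal sketch run against an inline XML snippet instead of a live fetch; on a real site you would download your post sitemap with `urllib.request` and compare it against the URL list from your spreadsheet:

```python
import xml.etree.ElementTree as ET

# A tiny inline sitemap standing in for yoursite.com's post sitemap.
SITEMAP_XML = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/ai-tools</loc></url>
  <url><loc>https://example.com/current-guide</loc></url>
</urlset>"""

NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}
listed = {loc.text for loc in ET.fromstring(SITEMAP_XML).findall(".//sm:loc", NS)}

# URLs you de-indexed and therefore expect to be absent from the sitemap.
deindexed = {"https://example.com/old-tactics"}
leaks = deindexed & listed
print(leaks)  # an empty set means the sitemap agrees with the noindex tags
```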
Step 5: Clean Up Internal Links
This step is optional but recommended. You want to remove obvious internal links pointing to the de-indexed content.
I’m not talking about every single link. That’s not practical. But you should clean up:
- Category page archives
- Sidebar ‘recent posts’ or ‘popular posts’ widgets
- Footer link sections
- Related posts modules
The goal is to avoid situations where Google crawls your homepage, sees links to de-indexed content in the sidebar, and gets confused about whether that content actually matters.
For category pages, I’m using a simple approach: if the category only contains de-indexed posts, I’m setting the category archive itself to noindex as well. No point having an index page for content that doesn’t exist in search results.
Step 6: Document Everything
Keep a record of what you did and when. At minimum, save:
- Your complete URL list with before/after status
- Screenshots of GSC indexing stats before you start
- Screenshots of third-party tool estimates (Ahrefs, SEMrush) before you start
- The date you made the changes
This documentation serves two purposes. First, if something goes wrong, you can revert. Second, it gives you a baseline to measure results against.
What I’m Watching For (Success Metrics)
I’m not expecting overnight miracles. This is a 6-8 week experiment with very specific metrics.
Here’s what I’m tracking in Google Search Console:
Week 1-2: Number of indexed pages should drop significantly. This is the expected outcome. I want to see it fall from 1,810 to around 500-600 indexed pages (only my core content).
Week 2-4: The ‘Crawled – currently not indexed’ number should start falling. This indicates Google is processing the noindex tags: those pages should shift into the ‘Excluded by noindex tag’ report, and over time Google tends to crawl noindexed pages less often.
Week 4-6: Impressions in search results should stabilize and potentially start rising. Impressions are the leading indicator. They tell me Google is starting to understand what my site is about and is showing it for relevant queries.
Week 6-8: Actual traffic (clicks) should follow. Traffic is a lagging indicator. It responds to impressions, which respond to clarity about site focus.
Current GSC performance showing the baseline traffic levels
But the real test is commercial: will Ahrefs and SEMrush start showing my organic traffic above 1,000 monthly visits?
That’s the threshold where sponsorship conversations become viable again. Below 1,000, you’re not on their radar. Above it, you’re a potential partner.
If I don’t see meaningful movement in these metrics by week 8, I’ll reconsider the approach. But I’m optimistic this will work because the underlying logic is sound: clarity beats volume.
The Risks (What Could Go Wrong)
I need to be honest about the downside scenarios.
Risk 1: Some of those older articles might still be getting traffic. By de-indexing them, I’m voluntarily giving up that traffic. I’ve looked at the data and I don’t think this is a significant issue, but there’s always the chance I’m wrong.
Risk 2: Those posts might have backlinks I’m not aware of. De-indexing doesn’t remove the backlinks, but it does mean Google can’t pass authority through those pages to the rest of my site.
Risk 3: Google might interpret this massive de-indexing as a sign the site is dying rather than being cleaned up. The algorithm could decide I’m abandoning the site and reduce its overall trust.
Risk 4: Third-party tools might not update their estimates quickly enough. Even if Google responds well, it could take months for Ahrefs to recrawl my site and adjust their traffic projections.
These are all risks. But here’s why I’m doing it anyway: the current situation is already a slow-motion failure. Traffic is declining. Revenue is gone. Doing nothing guarantees continued decline.
This de-indexation strategy is a calculated bet that clarity and focus will outperform volume and breadth. I think that bet is sound in 2026, especially as AI-powered search continues to prioritize topical authority.
Who Should Consider This Approach
This strategy isn’t for everyone. If your traffic is stable or growing, don’t touch anything.
But if you’re in a similar situation to mine, this might be worth considering:
- Your traffic has been declining for 12+ months with no clear cause
- You have hundreds or thousands of old posts that don’t match your current focus
- Google Search Console shows more unindexed pages than indexed pages
- Third-party tools are underestimating your traffic and it’s affecting monetization
- You’re willing to trade short-term traffic risk for long-term clarity
The key question is: do you have a clear line between content that matters and content that doesn’t? If you can’t draw that line, this approach won’t work. You’ll just be randomly de-indexing pages and hoping for the best.
But if you can create a systematic rule that separates signal from noise, this is a viable recovery strategy.
What Happens Next
I’m documenting this entire process publicly because I want to see if the conventional wisdom that more content is better still holds true.
My hypothesis is that in 2026, with AI changing how search engines evaluate content, focus and clarity matter more than volume. Topical authority beats broad coverage.
Over the next 8 weeks, I’ll share updates on what’s happening:
- How quickly Google de-indexes the tagged content
- Whether impressions and traffic respond
- What third-party tools start showing
- Any unexpected consequences, good or bad
This is an experiment with actual stakes. If it works, I’ll have rebuilt a monetizable traffic base. If it doesn’t, I’ll document what went wrong so you can avoid the same mistakes.
Next week, I’ll write about the lead magnet system I’m building to grow my email list while organic traffic recovers. That’s the other half of the rebuild: don’t rely on search traffic alone.
Same approach. Just showing you what I’m doing.