
Google de-indexed Bear Blog and I don't know why

firefoxd

Traffic to my blog plummeted this year, and you can never be entirely sure why. But here are two culprits I identified.

1. AI Overviews: my page impressions were high, my ranking was high, but click-through took a dive. People read the generated text and move along without ever clicking.

2. You are now a spammer. Around August, traffic took a second plunge. In my logs, I noticed weird queries hitting my search page: people were searching my blog for crypto and scammy websites. Odd, but it's not like they were finding anything. It turns out their search query was displayed as an h1 on the page and crawled by Google. I was effectively displaying spam.

I don't have much control over AI Overviews, because disabling them means I don't appear in search at all. But for the spam, I could do something: I added a robots noindex to the search page. A week later, both impressions and clicks recovered.
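For anyone wondering what that fix looks like in practice: the usual approach is a robots meta tag in the search template's head. A minimal sketch in Python (the function name and template shape are illustrative, not Bear Blog's actual code):

```python
def search_page_head(query_title: str) -> str:
    """Head fragment for a blog's search results page.

    The robots meta tag asks crawlers to drop this URL from their
    index, so reflected search queries can no longer rank in Google.
    """
    return (
        "<head>"
        '<meta name="robots" content="noindex">'
        f"<title>{query_title} - search results</title>"
        "</head>"
    )
```

Because the tag lives in the template, every search URL gets it regardless of what the query string contains.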

Edit: Adding write up I did a couple weeks ago https://idiallo.com/blog/how-i-became-a-spammer

dazc

Sounds like point 2 was a negative SEO attack. It could be that your /?s page is being cached and getting picked up by crawlers.

You can avoid this by not caching search pages and applying noindex via the X-Robots-Tag header: https://developers.google.com/search/docs/crawling-indexing/...
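The header variant can be applied at the HTTP layer so no template has to change. A sketch, assuming a WSGI app and that /search is the only reflected page (the path and framework choice are assumptions for illustration):

```python
def noindex_search(app):
    """WSGI middleware: add 'X-Robots-Tag: noindex' to search responses.

    Equivalent to the robots meta tag, but set as a response header,
    which also covers responses served from a cache without HTML edits.
    """
    def wrapped(environ, start_response):
        def patched_start(status, headers, exc_info=None):
            if environ.get("PATH_INFO", "").startswith("/search"):
                headers = list(headers) + [("X-Robots-Tag", "noindex")]
            return start_response(status, headers, exc_info)
        return app(environ, patched_start)
    return wrapped

# Tiny demo app so the middleware can be exercised without a server.
def demo_app(environ, start_response):
    start_response("200 OK", [("Content-Type", "text/html")])
    return [b"ok"]

def headers_for(path):
    seen = {}
    def start(status, headers, exc_info=None):
        seen["headers"] = headers
    noindex_search(demo_app)({"PATH_INFO": path}, start)
    return dict(seen["headers"])
```

Only the search path gets the header; regular posts stay indexable.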

motbus3

I posted some details in the main thread, but I think you might need to check the change in methodology for counting impressions and clicks that Google made around September this year.

They say the data from before and after is no longer comparable, as they are no longer counting certain events below a threshold. You might need your own analytics to understand your traffic from now on.

bootsmann

Sorry, but how did 2 work before you fixed it? You saved the queries people made and displayed them?

firefoxd

So the spammer would link to my search page with their query param:

    example.com/search?q=text+scam.com+text
On my website, I'd display "text scam.com text - search result". Now Google sees that text in my h1 tag and page title and decides I'm probably promoting scams.

Also, the reason this appeared suddenly is that I added support for Unicode in search. Before that, the page would fail if you used Unicode. The moment I fixed it, I allowed spammers to have their links displayed on my page.
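To make the mechanism above concrete, here's a minimal sketch of what a crawler sees when it follows a spammer's crafted link. The attacker controls everything after ?q=, and the site reflects it into the page's most heavily weighted elements (function name and URL are illustrative):

```python
from urllib.parse import urlsplit, parse_qs
import html

def reflected_h1(url: str) -> str:
    """Return the h1 a crawler would see for a spammer-crafted search URL.

    The attacker controls ?q=..., and the site reflects it straight
    into the h1 (and title), which Google weights heavily for ranking.
    html.escape prevents markup injection, but the spam *text* still
    appears on the page, which is why noindex is the real fix.
    """
    query = parse_qs(urlsplit(url).query).get("q", [""])[0]
    return f"<h1>{html.escape(query)} - search result</h1>"
```

So a link like example.com/search?q=visit+scam.com+now yields an h1 containing "visit scam.com now", and once Google crawls that URL the site appears to be promoting the scam.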

Calavar

Reminds me of a recent story about scammers using search queries to inject their scam phone numbers into the h1 header on legitimate sites [1]

[1] https://cyberinsider.com/threat-actors-inject-fake-support-n...

Neil44

Interesting - surely you'd have to trick Google into visiting the /search? URL in order to get it indexed? I wonder if the spammers listing all these URLs somewhere, or requesting that the page be crawled, is enough.

Since these are very low-quality results, surely one of Google's 10,000 engineers can tweak this away.

chii

I imagine the search page echoed the search query. Then an SEO bot automated searches on the site with crypto and spam keywords, which were echoed in the search results. Said bot may have a site or page full of links to these search results, creating fake pages for those keywords for SEO purposes (essentially an exploit).

Google got smart, found out about such exploits, and penalized sites that do this.

FuturisticLover

Google search results have gone to shit. I am facing de-indexing issues where Google cites duplicate content and picks a canonical URL itself, despite there being no similar content.

Just the opening is similar, but the intent is totally different, and so is the focus keyword.

Not facing this issue in Bing and other search engines.

daemonologist

I've also noticed Google having indexing issues over the past ~year:

Some popular models on Hugging Face never appear in the results, but the sub-pages (discussion, files, quants, etc.) do.

Some Reddit pages show up only in their auto-translated form, and in a language Google has no reason to think I speak. (Maybe there's some deduplication to keep machine translations out of the results, but it's misfiring and discarding the original instead?)

sischoel

The issue with auto-translated Reddit pages unfortunately also happens with Kagi. I am not sure if this is because Kagi uses Google's search index or if Reddit publishes the translated title as metadata.

I think at least for Google there are some browser extensions that can remove these results.

black_puppydog

The Reddit issue is also something that really annoys me, and I wish Kagi would find some way to counter it. Whenever I search for administrative things, I do so in one of three languages - German, French, or English - depending on the context in which the issue arises. And I would really prefer to only get answers relevant to that country. It's simply not useful to find answers about social security issues in the US when I'm searching in French.

Aldipower

Yeah, Google search results are almost useless. How could they have neglected their core competence so badly?

adaptbrian

Because they shifted their internal KPIs, roughly in 2018, toward keeping users on Google rather than tuning toward users finding what they are looking for, i.e., clicking off Google.

This is what has caused the degradation of search quality since then.

dev_l1x_be

Amazon, Google - it's the same. Fake products, fake results, scammers left and right.

bjt12345

What I find strange about Google is that there's a lot of illegal advertising on Google Maps - things like accommodation and liquor sellers that don't have permits.

However, if they do it for the statutory term, they can then successfully apply for existing-use rights.

Yet I've seen expert witnesses bring up Google Maps pins during tribunals over planning permits, and the tribunal sort of acts as if it's all legit.

I've even seen tribunal reports publish screenshots from Google Maps as part of their judgments.

rcxdude

Is it treated differently from other kinds of advertising? A lot of planning and permitting has a bit of a 'if it's known about and no-one's been complaining it's OK' kind of principle to it.

oakwhiz

legal citogenesis?

actionfromafar

Clan justice, google is the clan.

01HNNWZ0MV43FF

Reality is just tug of war and weight is all that matters at the limit

Aldipower

Google search also favors large, well-known sites over newcomers. For sites with a lot of competition, this is a real problem and leads to asymmetry and a chicken-and-egg problem: you are small/new, so you can't really be found, which means you can't grow enough to be found. At the same time, you are disadvantaged because Google displays your already-large competitors without any problems!

hyruo

I encountered the same problem. I also use the Bear theme, specifically Hugo Bear. Recently, my blog was de-indexed by Bing: using `site:`, there are no links at all. My blog had been running normally for 17 years without any issues before.

motbus3

Without going into details: the company I work for has potentially millions of pages indexed. Despite new content being published every day, since around the same October dates we have been seeing a decrease in the number of indexed pages.

We have a consultant for the topic, but I am not sure how much of that conversation I can share publicly, so I will refrain from doing so.

But I think I can say that it is not only about data structure or quality. The changes in methodology applied by Google in September might be playing a stronger role than people initially thought.

p410n3

What "changes in methodology applied by Google in September" are you referring to? Surely there is a public announcement that can be shared? Most curious to hear, as a shop I built has been experiencing massive issues since August/September 2025.

econ

I never really use it, but there is a lot in the Yahoo index that Google refuses to index.

https://search.yahoo.com/search?p=blog.james-zhan.com&fr=yfp...

guerrilla

I thought Yahoo! was just Bing now. The real Yahoo! died ages ago.

scosman

How does one debug issues like this?

I have a page that ranks well worldwide but is completely missing in Canada. Not just poorly ranked - gone. It shows up #1 for its keyword in the US, but won't show up even for precise, unique quotes in Canada.

graeme

It's entirely possible that the RSS failing validation triggered some spam flag that isn't documented, because documenting anti-spam rules lets spammers break them.

The amount of spam has increased enormously, and I have no doubt there are a number of such anti-spam flags - and a number of false-positive casualties along the way.

Eisenstein

If failing to validate a page because it points to an RSS feed triggers a spam flag and de-indexes all the rest of the pages, that seems important to fix. By losing legit content to such an error, they are lowering the legit:spam ratio, causing more harm than an indexed spam page would. It might not look bad for one instance, but it is indicative of a larger problem.

qwertox

When I reload the page "https://journal.james-zhan.com/google-de-indexed-my-entire-b...", I get

Request URL: https://journal.james-zhan.com/google-de-indexed-my-entire-b...

Request Method: GET

Status Code: 304 Not Modified

So maybe it's the status code? Shouldn't that page return a 200 OK?

When I go to blog.james..., I first get a 301 Moved Permanently, and then journal.james... loads, but it returns a 304 Not Modified, even if I then reload the page.

Only when I fully submit the URL again in the URL bar does it respond with a 200.

Maybe crawling also returns a 304, and Google won't index that?

Maybe prompt: "why would a 301 redirect lead to a 304 not modified instead of a 200 ok?", "would this 'break' Google's crawler?"

> When Google's crawler follows the 301 to the new URL and receives a 304, it gets no content body. The 304 response basically says "use what you cached"—but the crawler's cache might be empty or stale for that specific URL location, leaving Google with nothing to index.

jorams

You get a 304 because your browser tells the server what it has cached, and the server says "nothing changed, use that". In browsers you can bypass the cache by using Ctrl-F5, or in the developer tools you can usually disable caching while they're open. Doing so shows that the server is doing the right thing.
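For reference, this is standard HTTP conditional-request behavior: the client sends the validator it cached, the server compares, and 304 means "your copy is still good". A minimal sketch of the server-side decision, simplified to ETag validation only (the function shape is illustrative, not any real server's code):

```python
def respond(request_headers: dict, current_etag: str, body: bytes):
    """Decide between 200-with-body and 304-without-body.

    A 304 is only sent when the client proves (via If-None-Match)
    that it already holds the current version. A crawler with an
    empty cache sends no validator, so it always gets a full 200 -
    which is why the 304s seen in the browser don't starve Googlebot.
    """
    if request_headers.get("If-None-Match") == current_etag:
        return 304, b""   # client cache is fresh; no body needed
    return 200, body      # full response, plus ETag for next time
```

So the 304s in the browser are just proof that caching works, not something a fresh crawl would ever see.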

Your LLM prompt and response are worthless.

dazc

Breaking news: Google de-indexes random sites all the time, and there is often no obvious reason why. They also penalize sites in a way where pages are indexed but buried so deep that no one will ever find them. Again, there is often no obvious reason.

p410n3

Do you have any resources here? The /r/SEO subreddit seems very superficial; coming from a web agency background, it's hard to find legit cases versus obvious oversights. Often people make a post describing a legit-sounding issue there, just to let it shine through that they are essentially doing SEO spam.

quietfox

I'll be honest, I read "Google de-indexed my Bear Blog" and was looking forward to discovering an interesting blog about bears.

xeonmc

You may find rather unexpected results if you look for blogs with an interest in bears.

binarymax

Same. I still don’t know why the word “Bear” was used in the title.

Bengalilol

Coming from a quietfox, it is OK. It is important to preserve oneself ^^.