Global population datasets underrepresent rural population
112 comments
March 18, 2025 · wongarsu
MarkusQ
They assume that people were only relocated from the areas later covered by water. It seems quite plausible that the actual area of relocation was somewhat larger than that subsequently covered by water, which could explain the results.
HPsquared
For example if the road to town is underwater, your home is now unusable.
markovs_gun
Yeah wouldn't the floodplains shift as well? I looked at local flood plains when I was looking for a house and I don't want to live in a flood prone area.
hinkley
China got rid of their rules for how many children you can have a long time ago but I suspect such trauma doesn’t leave a population easily.
In a world where people are considered illegal, you’ll have lots of people trying to leave no footprints and staying far away from seats of authority, where the odds of drawing attention by accident are much higher.
See also the trope of the quiet stranger who moves to a rural town and keeps to himself. Does he just like the peace and quiet or is there a warrant out for his arrest in South Carolina?
kspacewalk2
China replaced a one-child policy with a two-child policy in 2015, which was replaced by a three-child policy in 2021. So, far from getting rid of rules for how many children you can have a long time ago, they relaxed rules for how many children you can have a short while ago. Also, China isn't rural Alaska, it is a centralized authoritarian surveillance state and not the kind of place one can go off the grid and raise a family. Not that a non-vanishingly-small number of people is doing that anywhere, other than by happenstance in the poorest countries where "off the grid" is the default.
bobthepanda
As the saying goes in China "the mountains are high and the emperor is far away."
China is certainly an impressive surveillance state but the reality is that the geography is so large that it is possible to miss things like human trafficking: https://www.nytimes.com/interactive/2025/03/05/world/asia/xu...
lolinder
> Also, China isn't rural Alaska, it is a centralized authoritarian surveillance state and not the kind of place one can go off the grid and raise a family.
Through some combination of state propaganda (their own and China's) and dystopian fiction like 1984, Westerners drastically overestimate the capabilities of the Chinese state. It would be more accurate to say that China wants to be perceived as a centralized authoritarian surveillance state, because that perception is valuable both domestically and abroad. They absolutely do have a broad surveillance apparatus, but they also have millions of square miles and more than a billion people to surveil and control.
Don't forget that we're talking about the government that has been unable to pull together either the political will or the state power to enforce health codes that would be considered table stakes for any developed democracy, ignoring or tolerating disease-spreading wet markets. They appear to pick and choose their authoritarian battles, and if you're not a battle they've chosen you can probably go ignored for a lifetime.
RegnisGnaw
It's actually not that uncommon. China is not a monolithic entity. Rules are flexible and there are always workarounds. In some cases, rural families that have a girl as a first kid will try again for a boy. They just don't register the girl for hukou.
makeitdouble
> it is a centralized authoritarian surveillance state and not the kind of place one can go off the grid and raise a family.
The logistics of centrally surveilling a massive area of land are pretty tough, much tougher than we imagine, I think.
Historically, independence-seeking people learn how to flee from aggressive governments and move to places that make it harder to track them. That includes cutting down on written documents that could expose their organization, adopting social structures that decentralize leadership, etc.
It was fascinating to me, I highly recommend checking the Zomia region.
e.g.:
https://medium.com/@matthijsbijl/welcome-to-zomia-the-anarch...
lelandfe
The three-child policy was lifted just months later after some outcry: https://fortune.com/2021/07/21/china-three-child-policy-decl...
wodenokoto
It’s been just over a decade since I lived in China in a tier-1 city, but back then it was common to hear things like rural people would have multiple children, with only one registered.
hinkley
When we are speaking of children, time is on another scale. An 18 year period creates whole adults from nothing.
2015 means there are legal 10-year-olds who would not have been legal before. And I expect some grey-area 11- and 12-year-olds who are surprisingly mature for their age.
csomar
This is hyperbole. It took China until a few years ago to be able to fully control crypto mining which was happening at industrial scale in China. My guess is that the Chinese accepted the draconian one child policy because the famine collectively affected them. They do not accept a lot of other things and the state is not as powerful as you think it is.
Another case is covid. The state tried to maintain the lockdown (which was partially supported by some of the population), but at a certain point it could no longer contain the dissatisfaction and had to fully open back up.
IncreasePosts
Weren't rural farmer types in china exempted from the one child rules? And all 50-something recognized non-Han minority groups?
foxglacier
China's authority reaches deep down to the community and office level. They have party members embedded everywhere keeping an eye on their neighbors and co-workers. I'd be surprised if they left any big rural areas un-surveilled.
tehjoker
It would be cool if any of our parties would give a fuck about how any of us are doing, but they don't. The communist party is supposed to be in tune with the people, and you can only do that by being with the people to understand what they need. Western propaganda pathologizes this to benefit capitalist ideology, because capitalists only are with the people to control them and make them work faster.
econ
No one knows how many Chinese people don't exist on paper and how many have multiple [Chinese] passports. There is reason to think the numbers are very large.
thadk
The gridded population tools I've used, including Worldpop, seemed to start with the national census data. Only with that do they use the various other measures as a way to subdivide the census data administrative units.
I'm not sure if any of the mentioned gridded population tools are substantially census independent. Haven't read the paper yet, but maybe that could be one reason why methods are so similar.
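To make the census dependence concrete, here is a minimal sketch of top-down disaggregation as I understand it. The function name, the weights, and the numbers are all hypothetical and not taken from WorldPop or any other tool; real products derive their weights from covariates such as settlement maps or night-time lights.

```python
def disaggregate(census_total, cell_weights):
    """Split one admin unit's census count across its grid cells.

    Each cell receives a share proportional to its weight. The weights
    stand in for whatever ancillary signal a product uses (hypothetical here).
    """
    total_weight = sum(cell_weights)
    if total_weight == 0:
        # No ancillary signal at all: fall back to an even split.
        return [census_total / len(cell_weights)] * len(cell_weights)
    return [census_total * w / total_weight for w in cell_weights]


# Hypothetical admin unit: the census reports 1,000 people in four cells.
weights = [8.0, 1.5, 0.5, 0.0]
cells = disaggregate(1000, weights)
print(cells)       # per-cell population estimates
print(sum(cells))  # always adds back up to the census total
```

Because the cells always sum back to the census figure, an undercount in the census carries straight through to the grid, whatever the weights look like. That would be one way for otherwise different methods to share the same error.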
araes
Part that got me midway through was the suspicion (confirmed at the end) that maybe the current 8,000,000,000 estimates for world population numbers have some rather large inaccuracies.
From the discussion:
> how reliable current global population estimates really are. For example, is it possible that global population estimates from the United Nations (7.98 billion in 2022) or World Bank (7.95 billion in 2022), both relying heavily on national population censuses, miss a significant part of the world population?
PostOnce
It seems like we could get some good estimates based on non-census data.
For example cellular data or ad data, rf emissions, CO2 or other chemical emissions, satellite photography of human activity (mowing, cropping, logging, building, etc), food or fuel shipments, etc.
A census may miss people, but the rest of the data wouldn't.
svnt
Almost none of this exists for these time periods and populations. Tree cover is mentioned as a problem preventing satellite imagery.
svnt
They also mention that, according to their data, estimates improved significantly from 2000 to 2010.
This could mean that actual population growth during that period was much slower than it appeared, with part of the apparent growth down to improved measurement.
delichon
If this is true of the U.S. Census it would imply that congressional apportionment is biased against rural areas. So the question has a strong political valence and neutrality is not to be expected.
asadotzler
Rural Mississippi (O Brother, Where Art Thou?)
Delmar with companions: "You Wash's boy?"
Child with long gun: "Yessir and Daddy told me I'm to shoot whoever's from the bank."
Delmar: "Well, we ain't from no bank, young feller."
Child: "I'm also suppose to shoot folks servin' papers!"
Delmar: "Well we ain't got no papers"
Child: "I nicked the census man."
Delmar: "Now there's a good boy"
01HNNWZ0MV43FF
It would have to be quite a large bias to overcome the bias of the Senate
tshaddox
That's only relevant when comparing states with low populations and states with high populations, which is related but not quite the same as biasing rural populations over urban populations. California has way more rural population than Vermont, but Vermont voters are vastly overrepresented in the Senate (and the electoral college).
The bias towards urban populations would be more significant in congressional districts within a state, I reckon.
tdb7893
Are the datasets in the study calculated the same way as the census? It looks to be satellite data and remote sensing (which I'm not aware of the census using) so this doesn't seem to imply anything about the census
spookie
Well, this is the case in a lot of countries. In my home country each "state" has a number of representatives according to its population, and just the population. This has slowly but surely made it so laws are more and more biased towards metropolitan areas. For someone raised in a rural area it's mind-boggling how far this has gotten. I guess this paper adds insult to injury.
mmooss
In the US, the bias is the other way: Effectively, the Constitution gerrymanders the US Senate in favor of rural populations. Every state gets two senators, so highly urban states with much greater population have fewer senators per person than rural states with much less population.
That also biases presidential elections, due to the way the Electoral College is calculated.
ty6853
Because the federal government was meant to be extremely limited and mostly serve to mediate _actual_ interstate commerce and foreign affairs. The oft ignored 10th amendment left most everything else to states or the people.
The system wasn't designed to fairly accommodate the gigantic social benefits, massive regulatory apparatus, and local spending apparatus the feds now get involved in. The people that designed the Senate thought that stuff mostly wildly unconstitutional. If they planned on that stuff being constitutional they'd have likely made the Senate more like the house in composition.
ch4s3
That’s not what gerrymandering means, which refers to a process of carving up improbably shaped constituencies to create safe electoral seats by choosing the voters. The Senate is just plainly 2 per state as a compromise to keep populous states from steamrolling small or sparsely populated states. You may criticize that compromise but it’s not gerrymandering.
Loughla
That's why the house exists? Senate is equal representation for states. House is equal representation for people.
Also the presidential election is decided by delegates based on population.
What am I missing here? I feel like that's basic information.
newAccount2025
Why should laws favor rural regions instead of people?
spookie
I understand that, and the fact that basing yourself on population alone is simple and easy to understand.
But, your rural areas may be over half of your territory. People living there also matter, and the needs of those territories are very different.
Just to bring an example of how the current system backfires:
- we need railroads to reopen so we are able to sell produce to our own industry base competitively
- we are always told there aren't enough people here, so no railroads. Even though, the primary objective of those wouldn't be to transport people
- we sell our produce to the neighbouring country instead, given it is closer than our own country's industrial hotspots. By selling off raw produce, our country will end up earning less than if it were processed by our own food industry.
- without much economic incentive to remain living there, many will emigrate.
- fewer people = even less representative power.
The cycle continues...
Now, I don't have a solution. It is a difficult problem to solve. One could be to have a higher minimum number of representatives per region, as the least populated region only elects 2, 24 times fewer than the most populated one. And, just to show how different the political landscape is when compared to the US: the metropolitan regions have voted way more for the far-right and far-left parties than the rural regions have, which has led to a lot of divisive political outcomes.
aoki
It’s not.
A census literally counts every individual. Census public reports are aggregated for privacy.
This is about spatial disaggregation of aggregated census data. The problem in developing countries (which is what these datasets are used for) is that they often fail to run a complete census (in some cases, for many decades) because it is expensive and/or the government is not functional. So these datasets may not be well-calibrated overall.
The US has no problem running a decennial census, aside from nonresponse by immigrants, conspiracy theory enjoyers, etc.
InitialLastName
> A census literally counts every individual. Census public reports are aggregated for privacy.
Is there any evidence of how accurately the US census actually does that job? Having spoken with a number of people who were involved with the 2020 census (mostly on a volunteer, local basis) the answer is not "absolutely every person in the country got counted exactly once". There are a number of sources of error that would seem essentially impossible to fully remediate, due to people being complicated and error-prone and the census being largely self-reported:
- People in multiple households being counted twice, e.g. college students being counted both at college and at home.
- People who refuse to participate (one of the census-takers I know had someone wave a gun at her face when she tried to follow up on their household's non-response).
- People who are transient or disconnected enough from the social fabric that they are hard to maintain consistent contact with. I knew someone involved with taking the census for an urban unhoused population and there was no doubt that they both missed and double-counted people; I'm sure it's even more difficult for rural populations that might be even less connected and even more transient.
It's not hard to imagine that, on net and despite there being rules and processes for handling each of those situations, there ends up being regional and/or urban/rural bias in the final counts.
Aloisius
> Is there any evidence of how accurately the US census actually does that job?
Yes. The Census uses a post-enumeration survey for this purpose. They found evidence of significant under- or over-counting in 14 states for the 2020 census.
That was considerably worse than the 2010 census, though covid might be partly to blame. Still, afaik, there wasn't anywhere near the rural undercount described in the article.
https://www.census.gov/library/stories/2022/05/2020-census-u...
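For anyone wondering how a follow-up survey can measure coverage at all: as I understand it, post-enumeration surveys rest on a capture-recapture style calculation (dual-system estimation). Below is a heavily simplified sketch with made-up numbers. The function names are mine, and the real procedure involves matching rules, stratification, and corrections that are not shown here.

```python
def dual_system_estimate(census_count, survey_count, matched):
    """Lincoln-Petersen style estimate of the true population of an area.

    census_count: people the census recorded in the sample area
    survey_count: people the independent follow-up survey recorded there
    matched:      people found in both lists
    """
    if matched <= 0:
        raise ValueError("need at least one matched person")
    return census_count * survey_count / matched


def net_coverage_error(census_count, estimate):
    """Negative means a net undercount, positive a net overcount."""
    return (census_count - estimate) / estimate


# Hypothetical sample area: all three figures are invented for illustration.
estimate = dual_system_estimate(census_count=900, survey_count=500, matched=450)
error = net_coverage_error(900, estimate)
print(estimate)        # estimated true population of the area
print(f"{error:.1%}")  # net coverage error of the census count
```

With these invented inputs the survey finds that only 450 of its 500 people were in the census list, which implies the census reached about 90% of the area and so undercounted by roughly 10%.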
DennisP
US rural people tend to trust the government less, so I could definitely see them getting undercounted.
The census should really have an option for refusers that collects an absolute minimum of information. The Constitution only asks that people be counted, not that they be broken down into categories.
At least it's just the short form now. Prior to 2010, one out of six people got the long form, which had ten pages of questions for each person in the household. Here's a pdf: https://www.census.gov/content/dam/Census/programs-surveys/d...
mootothemax
> Having spoken with a number of people who were involved with the 2020 census (mostly on a volunteer, local basis) the answer is not "absolutely every person in the country got counted exactly once".
Do you have any thoughts on how the US census takes your insightful comment's points into account?
I imagine it's got to be quite an involved process given the vast differences in US geography, kinda blows my mind thinking about even basic stuff like age demographics vs. taking into account how many people died on census day.
But then - maybe that's too granular and it all balances out in the end? Or at least, it does if you use special magic sauce the US census has covered?
Spooky23
The Trump 45 admin did not prioritize the census and their ramp-up wasn’t particularly good compared to past years. In my city there were like 30% fewer enumerators.
In the US, rural undercounting isn’t really a problem politically, although it has negative impacts on revenue allocation that is population based like sales taxes.
The biggest issue is that poor residents are underreported. This is both an urban and rural issue but from a numbers perspective impacts urban areas more.
mootothemax
>A census literally counts every individual.
I'm afraid that this isn't the case; a census is a best-efforts estimate.
There are plenty of people missed for a variety of reasons, everything from not wanting to be found, through to simple avoidance. Let alone filling out the forms incorrectly or giving dud answers to the army of amazing people trying to make sense of all the madness.
Edit: realising the above has added to a long list of things that once upon a time I thought were hard-set facts, and nowadays I'm slowly losing my mind over. Coordinates, populated place boundaries, census counts + demographics... I mean, what _is_ an address? Incredibly painful to get to the bottom of that one, at least in the UK where the definition of an address will vary significantly by use case
screams into the void
Mountain_Skies
A census worker wasn't able to get in contact with the people who lived across the street from me. They suddenly moved out one weekend during the first summer of the pandemic and never came back, leaving the house vacant. I didn't know them very well, but the census worker asked me to make my best guess about the ages of each adult and child who had lived there.
Did that cause a double count due to them being counted wherever they moved to? Were my guesses correct? Or correct enough? No idea but that's the data the census worker recorded.
senkora
The census has an official date to avoid issues like this. Since they lived at the house across the street from you on the official census date of April 1st 2020, the census worker was correct to record them at that address.
They would only have been double counted if they incorrectly reported living at their new address as of April 1st.
ericmcer
There is just no way they count every individual. I live in California and was visiting someone in the central valley. For fun I drove some back dirt roads into some hills and there was a small corrugated sheet metal building style settlement with 20ish people hanging around it a ways back in. It really made me wonder how many people are tucked away. Not like a 24 year old kid who gets a college degree and intentionally lives off grid, but poor/uneducated people who eke out a living somewhere off the beaten path.
xbmcuser
It could be true, but as someone from Asia, the first thing that came to my mind was corruption. The village head adds 5-10% to his village's population to pad the compensation, and the process repeats at the next 2-3 levels, so by the time payment comes around the count is 20-30% higher than the actual population.
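A quick sanity check on that arithmetic, assuming (purely for illustration) that every level applies the same padding rate and that the padding compounds:

```python
def compounded_inflation(rate, levels):
    """Total overcount when each of `levels` administrative levels pads the count by `rate`."""
    return (1 + rate) ** levels - 1


# The village head plus the next 2-3 levels gives 3-4 levels in total.
for rate in (0.05, 0.10):
    for levels in (3, 4):
        total = compounded_inflation(rate, levels)
        print(f"{rate:.0%} per level x {levels} levels -> {total:.1%} total")
```

That works out to roughly 16-22% at 5% per level and 33-46% at 10% per level, so a 20-30% overall overcount is in the right range for padding in between.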
pbhjpbhj
Most censuses have sampling of responses to verify and have post-enumeration surveys. If that were a widespread activity then it would get found (assuming political/administrative will to do so).
commonsenseman
Did nobody suspect that it's not that the rural population is underrepresented but that the people who want compensation are overrepresented? There was a dam built in the mountains a few hundred km away from my city and even here people applied for compensation by hook or by crook. Some enterprising folk specifically installed garages or sheds or hunting cabins or summer houses in the flooded area to receive it.
varenc
Is there an incentive for fraud during dam construction re-settlement? Like if a family home is being displaced, it might benefit the family to list all possible relatives on the re-settlement even including ones that don’t actually live there.
instagraham
If population level estimates are systematically off, does this also skew any prevalence data or ratios derived from these datasets?
For example disease prevalence, lifestyle condition differences between rural and urban, etc
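To sketch why the denominator matters, here is a toy calculation. The numbers and the 50% undercount are hypothetical, and it assumes the case counts themselves are accurate, which may well not hold in the same undercounted areas.

```python
def prevalence(cases, population):
    """Cases per person in the population."""
    return cases / population


def corrected_population(estimated, undercount_fraction):
    """Population implied if the estimate misses `undercount_fraction` of the people."""
    return estimated / (1 - undercount_fraction)


# Hypothetical rural district: 50 recorded cases, dataset says 1,000 residents,
# but suppose the dataset actually misses half of the people living there.
cases = 50
reported = prevalence(cases, 1000)
actual = prevalence(cases, corrected_population(1000, 0.5))
print(f"reported prevalence:  {reported:.1%}")
print(f"corrected prevalence: {actual:.1%}")
```

Under those assumptions the rate computed from the dataset is twice the real one, so rural/urban comparisons built on such denominators would overstate rural rates. If cases are undercounted too, the two errors could partly cancel.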
modeless
Is there compensation given to people resettled during dam construction?
pixl97
Given to people, or given to property owners?
bee_rider
Surely it depends on the jurisdiction
ajmurmann
And theory and practice might vary. I saw a heartbreaking segment in a documentary about the Three Gorges Dam about a very old couple who were supposed to receive compensation or replacement for their home being flooded. It never arrived. So we got to watch these two people in their late 70s or 80s carry their old home brick by brick up a steep mountain side.
aziaziazi
Return to Dust (隐入尘烟, 2022) has some similarities, but I won’t say more (no spoilers).
The movie is great, and the story of its censorship in China is worth a glance. There is no direct criticism, but it depicts some of the collateral damage of rural transformation. It is also poetic and contemplative.
phtrivier
Not that I want to riff on the "in mice" joke, but looking at figure 2, it seems like the title should read "systematically underrepresent rural population _in China_", right?
(Some other countries are included, but it seems like China, or at least south east Asia, provides the bulk of the data.)
Which is not nothing - if China's population is underestimated, even by a small amount, that means a lot.
But there might be purely historical reasons ("ghost kids" unreported because of the "one child policy" come to mind) that would not apply to, say, the US census, or the size of constituencies in Europe that would undermine right wing parties.
OfficeChad
[dead]
Because of the nature of the data (resettlements for dam construction) two thirds of the data is from China. But interestingly according to Figure 7 [1] the discrepancy exists even in countries with normally very meticulous record keeping, like Sweden and Germany. I find this surprising.
The root cause analysis also falls short for those cases: Germany's bureaucracy might be underfunded but fundamentally requires every resident to register at their place of residence. There is also no large-scale conflict or violence in Germany, no regions are really remote or hard to reach, and all rural areas are electrified (important for satellite-based night-time light counting by GRUMP and WorldPop). The only satisfying root causes left are about satellite counting methods being too coarse or badly tuned to accurately count rural areas, or rural living patterns possibly not fitting the assumptions of these models. But it's weird that these errors are so similar between so many different methods, not all of which even use satellite data.
[1] https://www.nature.com/articles/s41467-025-56906-7/figures/7