StarDict sends X11 clipboard to remote servers

337 comments

·August 12, 2025

sugarpimpdorsey

> In response, Xiao pointed out that the package description can be read by any user who chooses to install the software, and it does mention the scan feature.

Wouldn't be the first (or last) time a Debian maintainer has pulled the "you should read the descriptions of all (hundreds) of your packages (most installed as dependencies)" card in response to a bug report.

If someone started reading all the package descriptions and READMEs we're meant to be thoroughly familiar with when Trixie was released a few days ago, they'd still be reading them.

jraph

“the plans and the demolition orders have been on display at the local planning office on Alpha Centauri for fifty of your Earth years. If you can't be bothered to take an interest in local affairs...”

https://www.youtube.com/watch?v=Z1Ba4BbH0oY

tomsmeding

For the uninformed: this is a quote from The Hitchhiker's Guide to the Galaxy.

jraph

Thanks for giving the reference right here. I should have included it in addition to the link!

wolfi1

[flagged]

Sesse__

You mean, for those who couldn't be bothered to click the link under a joke.

jacquesm

Such responses to me are proof of malicious intent.

avhception

While I think the response was not well thought out, it's still a far cry from "proof of malicious intent".

jacquesm

We're not going to agree on that. The response is clearly there to point to a fig leaf instead of saying 'oh, oops, we will make this more obvious in the UI', the software is working as intended: as a way to gain access to more data.

Note that clipboard data can be just about anything and is a valuable dataset, more so if the source of the data isn't aware of being a source, besides, there is no history so you won't even know what you've lost.

JumpCrisscross

> it's still a far cry from "proof of malicious intent"

Is the difference meaningful? It’s proof of a value set so different from the community’s as to merit the same response: expulsion.

rangerelf

I disagree; it's basically lawyerspeak for "sucks to be you".

If one is expected to go through all the documentation of both the main package and all dependency packages, and also through whatever configuration details are specific to your case, just to be able to catch a specific IMPORTANT detail that's not clearly spelled out in the main package, that's malicious.

"A dependency we use captures your clipboard data and sends it to remote servers"

That sentence right there would kill their userbase, so they omit warning you about it. And on top of the "...user should have read the description..." non-apology, "just split the packages, bro".

That's malicious.

ASalazarMX

It's clearly a defensive excuse, as it is extremely unrealistic to expect final users to read all the docs of all the dependencies of a Linux distro. It's the responsibility of the maintainer to read the subset of docs relevant to the package(s) they're contributing, not the user's.

It could be that they were caught with their pants down and posted an ill-thought response, but I'd lean strongly towards malice with such a poor defense, it borders on confession. Clipboards are one of the most critical privacy/security features, you don't ever want to leak them unintentionally.

Did we already forget about the XZ Utils backdoor? There have to be multiple efforts to slip backdoors into Linux going on right now.

https://en.wikipedia.org/wiki/XZ_Utils_backdoor

account42

We can't afford that level of benefit of the doubt for the people that are supposed to guard us from exactly this kind of bs.

Intent or not, that developer is a risk to the project.

npteljes

Hanlon's razor applies here, I think. It's just ignorance, not malice. I doubt the maintainer has a connection to these two random dictionary websites, or was pressured by them to include this, nor do I think that they gain any advantage from it.

People need to be on the lookout though, the xz incident showed that FOSS is indeed vulnerable.

poemxo

I think Hanlon's razor is outdated. Plausible deniability is the new meta. On top of that, the maintainer seems intent on not fixing the problem.

blackhaz

But it cannot be adequately attributed to ignorance, so no, Hanlon's razor does not apply. There is an obvious security breach.

chuckadams

Sufficiently advanced ignorance is indistinguishable from malice.

(but malware authors usually cover their tracks better)

vorgol

> pressured

Maybe incentivized? $1000? $10000? Would be interesting to hear from the developer himself.

dingnuts

guy works for a Chinese media company and he's essentially trying to slip a backdoor into Debian systems.

malice & typical CCP behavior IMHO. The responses from the maintainer are unacceptable and he should have his privileges stripped

frumplestlatz

Willful negligence is, at some point, malicious.

more_corn

No. The simplest answer is that they’re deliberately and maliciously exfiltrating data. The other explanation requires more hoops.

Lockal

There are dozens of Chrome extensions that translate (read: submit to an untrusted server) on hover / highlight / context menu / textarea edit / etc. It is implied that the user acknowledges this functionality and accepts the risk. This includes the untrusted server (because that's how they proxy requests to Google/Bing/Yandex Translate without exposing API keys).

Security illiteracy? Yes. Malicious intent? Probably no.

Does being security illiterate equal malicious? Debatable.

jeltz

Not sure if I would call it malicious but I would call it gross negligence.

johnklos

No reasonable person expects privacy when using Google and/or Google provided products / software.

When you use Debian, you have a reasonable expectation of privacy.

People who handwave that away or say it's not as bad as something else either have an agenda or are ignorant about the history of Debian.

oblio

A moderately popular Chrome extension is frequently bought for tens of thousands of dollars for various purposes, frequently malware injection. They contact extension makers.

I think the bar for trust in terms of evil intent is on the floor.

DonHopkins

[flagged]

thegrimmest

Why can't reasonable people disagree here? Surely the utility of some features might outweigh the security concerns for some people. Making features opt-in instead of opt-out significantly changes their discoverability and usage metrics. On the whole, a translation system that has a feature to translate selected text seems hardly surprising. Similarly, using an online service to improve translation quality and reduce local resource usage also seems reasonable.

Fundamentally, always-online, home-phoning features are the norm, and it should be up to OS distributions to manage security postures such as allowlists for network access. Think something along the lines of "StarDict wants to connect to dict.cn. Allow/Deny?".

pabs3

> Think something along the lines of "StarDict wants to connect to dict.cn. Allow/Deny?".

That is what opensnitch provides, as do some other detection tools.

https://wiki.debian.org/PrivacyIssues#Detection_tools

foresto

> Why can't reasonable people disagree here?

They can, but framing this as a mere disagreement is disingenuous: One approach might slightly inconvenience someone, while the other (as was taken here) inflicts irreparable damage.

> Fundamentally, always-online, home-phoning features are the norm,

No. Although common on certain platforms, they are not a fundamental norm in software, nor should they be.

In particular, we're talking about Debian here.

rusk

Such a response is not considered a valid defence under GDPR. You cannot sign away your right to privacy any more than you can sign away your right to life.

JumpCrisscross

> You cannot sign away your right to privacy any more than you can sign away your right to life

You can literally do both in the EU with informed consent.

sim7c00

i agree. if in 2025 ppl dont understand plaintext of user data to places on the net is bad, they should not write code nor be maintainers of oss software -_-.

how many times does everyone need to be totally compromised by some shitty software before people start to care?

innocent individuals each day are suffering hacks and malicious interactions. people are losing their livelihoods. companies are getting shut down... what more needs to happen?? :S

thewebguyd

> i agree. if in 2025 ppl dont understand plaintext of user data to places on the net is bad, they should not write code nor be maintainers of oss software -_-.

LLMs are only going to make this worse. We're going to see a plethora of vibe coded slop everywhere.

CorrectHorseBat

Malicious intent written in the package description? I would think that really unlikely.

I think it's just a cultural difference. Sogou, a super popular Chinese input program for Windows, iOS, and Android, does the same with everything you type and nobody cares.

jacquesm

I'd say that having terms of service that document your shady behavior whilst at the same time not making this obvious in the UI in any way is a tried and true (corporate) malware pattern.

Just because Microsoft did it doesn't make it a valid defense; in fact it shows the opposite (after all, they too did not have the best interests of their users at heart). Given that the recipient of the data sits on the other side of the GFW and that clipboards can contain very interesting data, you really should wonder about the intentions of the author; they do not get the benefit of the doubt. In fact, open source software that to all intents and purposes looks like it runs locally but pumps your (private) data out without your consent is a very large red flag to me: it gains access to data that otherwise likely would never be found in the wild. At a minimum this is a fairly serious GDPR violation.

npteljes

I think so too. It's cultural difference, and ignorance at most. I doubt the maintainer has control over those two random dictionary websites, or was tasked by them to do this or anything like that. They are just a different person, and they didn't give a fuck.

chainingsolid

I install stuff from Debian's repos for 2 reasons: convenience & trust. And while people do complain when maintainers modify packages' behavior, I think people would rather have the "send my clipboard contents to someone else" feature be opt-in, instead of violating their trust!

zahlman

If this level of modification is required for a package to fit in with the distro's philosophy, maybe better not to include it at all.

m463

I think the answer might be to codify some of these assumptions.

It might help set things apart from say ubuntu, which doesn't engender the same amount of trust such as opt-in.

fodmap

I do agree with your point, especially when it is not the first time a package maintained by that guy exhibits unexpected behavior, like https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1010165 (Inappropriate package, modifies other package's (conf) files, should be removed from archive).

tremon

I disagree that "modifying other packages' conf files" is a problem. Conf files in general are there for the user to modify, and as the maintainer points out, it shouldn't matter if the user uses vi, joe, emacs or this specific tool to modify them.

The problem in this case is that the package modifies generated files belonging to another package. Making it about conffiles is bad phrasing by the bug submitter.

cesarb

> If someone started reading all the package descriptions and READMEs we're meant to be thoroughly familiar with when Trixie was released a few days ago, they'd still be reading them.

That used to be viable back in the late 1990s and early 2000s when I first used Debian. It would take an afternoon of going through all the packages in dselect (does anyone here still remember dselect?) and marking the ones you wanted to install, and around the same amount of time going through every option on the kernel's menuconfig to precisely tailor the kernel to your specific hardware configuration (things were much less dynamic back then).

Nowadays, there are simply too many packages and kernel configuration options to go through (also, does anyone still use dselect?).

juujian

Also, someone looked at the package and the description, that is why this issue has been raised.

wat10000

That doesn’t even address the problem! The package description does mention the scan feature, but not the automatically-send-it-to-a-server-in-plain-text feature.

Sure, if you read the description and the list of plugins and correctly guess how this plugin is implemented, then you can deduce some of it.

bayindirh

"RTFM!" comments come in flavors and bear nuances. In this case, as another commenter has pointed out, the answer smells fishy.

I have been told to "RTFM!" countless times in many places. Some of them were legitimately the correct answer in that context, in hindsight. Some were knee-jerk reactions like this.

Debian's discussion culture might be a little edgy sometimes, but this has nothing to do with Debian.

CamouflagedKiwi

> of course a dictionary program will include code to talk to dictionary-providing web sites.

I wouldn't say that is just a given, if I've apt-get installed a dictionary I might expect that is the whole thing on my machine. It's not like we haven't had dictionaries in physical books for centuries... It seems like stardict is very much an online thing, which I suppose could be legit, but the whole thing does seem like a trap.

kazinator

It's a generational thing. I would guess that someone who expects applications to phone home, on the off chance that they are actually otherwise local, is likely someone pretty young who hasn't lived in a world of locally installed software that doesn't talk to anything.

If we search for the author's bio, that seems to check out. They are a well-credentialed CS person; obviously they know that dictionary programs such as translation pop-ups can have offline dictionaries, and they mention that. But they are a person of their time with an according set of "of courses".

Today, an application being locally installed and working with offline data is like a statement of quaint chivalry, promulgated by a few remaining Don Quixotes of computing. (It saddens me to say. So much that this analogy brings me insufficient amusement.)

yorwba

For many languages, there simply isn't a comprehensive dictionary file that could be redistributed legally as part of a free-software offline dictionary application. You either settle for a few thousand words put together by a handful of volunteers, or you redistribute a commercial dictionary illegally, or you have to connect to an online service to provide sufficient coverage legally.

piperswe

Wiktionary is massive with 1.4M English entries [1] (3x the size of the Merriam Webster's Unabridged dictionary [2], though with a lower average quality), and CC-BY-SA-licensed[3]

[1]: https://en.wiktionary.org/wiki/Wiktionary:Statistics [2]: https://www.merriam-webster.com/help/faq-how-many-english-wo... [3]: https://en.wiktionary.org/wiki/Wiktionary:Copyrights

zamadatix

I could buy the idea of the plugin system itself being desired (e.g. maybe I even want english definitions from Merriam-Webster or something because I like their style more than the open source database) but I think that's separate from what an app does by default. Especially on something like Debian, one should expect a FOSS-first approach whenever reasonable, and for >99% of users the reasonable default is a local dictionary.

pxc

Dictionaries are small! It's insane to think that a dictionary requires network access. If it did, why would I install it locally??

> Today, an application being locally installed and working with offline data is like a statement of quaint chivalry, promulgated by a few remaining Don Quixotes of computing.

But a dictionary package has no valid reason to be online.

ryandrake

Wouldn't someone's expectation instead depend on the nature of the application, and what data it needs? My expectation is that an application does not access the network unless it requires a resource only available from the network. I would totally expect a "Yelp" application to make network requests as part of its core functionality. Yelp is an online service, and in order to use it, you have to talk to the network, and you're generally requesting data that might often change, so you need fresh copies. Same for an Internet browser, or ftp or git (for remotes) or things like that. I would not expect a spell checker to need to access a network because it can all be done locally and the spelling of words doesn't change often enough to need a fresh dictionary from the network over and over. And I certainly would not expect the software to send data to the network. I would also not expect a calculator application to request math function from the network or send my equations to a network service so that the network service could provide a result.

jcelerier

> It's a generational thing

... Is it? Dictionary apps have been working like this for more than twenty years. Babylon Pro, of which stardict is pretty much a clone, was already doing this with millions of users in the year 2000! Kindles work like that!

hdjrudni

Even if it's "legit", it shouldn't be using unencrypted HTTP.

sam_lowry_

Why? Should it use the dict protocol, then?

rootnod3

How about HTTPS?

mattmanser

Because without HTTPS it's trivial to MITM that clipboard content if they're always sending it via HTTP.

People in your coffee shop on the same WiFi could read it.

I get some people don't realize that's how TCP/IP works and the firesheep stuff all happened 15 years ago. But a bit worrying to see a frequent HN contributor challenging that.

That's why we now push for HTTPS everywhere.

account42

That stood out to me as well. It's a sad world when people expect even simple functionality to be a live service.

pantalaimon

The venerable ding does well with a local dictionary - and it's packaged in Debian too

https://www-user.tu-chemnitz.de/~fri/ding/

mkesper

But only english-german, sadly

mayama

At some point I started running GUI apps without network access, first with firejail and then bubblewrap. This was before flatpak became a thing. I still use a collection of bash scripts that built up over time to run applications in a sandbox.

waterhouse

  ~> wc -cl /usr/share/dict/words
  235976 2493885 /usr/share/dict/words

One might even expect a program to use a common Unix preinstalled dictionary.

dkiebd

"words" is nothing but a list of words. It does not contain definitions for those words, which is what one expects from a dictionary.

waterhouse

Hmm, you are correct.

delfinom

I wonder where one files a bug report that it's misusing "dict" under "words"

yjftsjthsd-h

Dumb question... Could you do a per-word bloom filter to do online spell checking without actually disclosing the words you're checking?

markasoftware

a bloom filter look up is by hash, and given the relatively small set of words in english, it would be pretty easy for the server to reverse the hash sent to it. Thus a bloom filter wouldn't be very private.

Additionally, a typical spell checker feature is to provide alternative, correct, spellings, rather than just telling you whether a word is correctly spelled.

I bet there's some cool way to do this with zero-knowledge or homomorphic cryptography though!
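To illustrate the hash-reversal point, here is a minimal sketch of how a server holding the word list could map a received hash back to the word. The word list, the choice of SHA-256, and the function names are all assumptions made for illustration; nothing here describes how any real spell checker works.

```python
import hashlib


def word_hash(word: str) -> str:
    """Hex SHA-256 of a word (the hash function is an illustrative choice)."""
    return hashlib.sha256(word.encode("utf-8")).hexdigest()


# Hypothetical server-side word list; a real one would be a few hundred
# thousand entries, which is still cheap to hash in full.
WORDS = ["apple", "banana", "cherry", "password"]

# Precompute hash -> word once. After that, reversing any dictionary word's
# hash is a single table lookup.
REVERSE_TABLE = {word_hash(w): w for w in WORDS}


def server_guess(received_hash: str):
    """Return the word behind the hash if it is in the word list, else None."""
    return REVERSE_TABLE.get(received_hash)
```

A string that is not in the word list (say, a reasonably strong password) stays unrecovered in this sketch, but every ordinary word the client looks up is fully visible to the server.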

notpushkin

There’s also a way simpler way: send a hash prefix to server, get a list of matches. Google Safe Browsing does this with URLs, for example.
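A rough sketch of the hash-prefix idea follows. The class and function names, the two-character prefix length, and the in-process "server" are hypothetical, and this is not a description of Google Safe Browsing's actual protocol. The client reveals only a short hash prefix, receives every entry sharing that prefix, and finishes the match locally.

```python
import hashlib
from typing import Dict, Optional

PREFIX_LEN = 2  # hex characters revealed to the server (illustrative value)


def word_hash(word: str) -> str:
    """Hex SHA-256 of a word."""
    return hashlib.sha256(word.encode("utf-8")).hexdigest()


class PrefixServer:
    """Stand-in for a remote service: it only ever sees hash prefixes."""

    def __init__(self, entries: Dict[str, str]) -> None:
        self.buckets: Dict[str, Dict[str, str]] = {}
        for word, definition in entries.items():
            h = word_hash(word)
            self.buckets.setdefault(h[:PREFIX_LEN], {})[h] = definition

    def lookup(self, prefix: str) -> Dict[str, str]:
        # Return every (full hash -> definition) pair in the bucket.
        return dict(self.buckets.get(prefix, {}))


def client_lookup(server: PrefixServer, word: str) -> Optional[str]:
    """Send only the prefix, then pick the exact match on the client side."""
    h = word_hash(word)
    candidates = server.lookup(h[:PREFIX_LEN])
    return candidates.get(h)
```

How much this hides depends on how many entries share each prefix: with a short prefix and a large word list the server learns only the bucket, while a long prefix degenerates into sending the full hash.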

shakna

You should be able to do a K-means type thing. Where your query is an entire group, and you grab the field from the chunk locally.

But you might still be able to use some frequency sampling to predict the words used, unless those chunks are very very carefully constructed.

Sesse__

> a bloom filter look up is by hash, and given the relatively small set of words in english, it would be pretty easy for the server to reverse the hash sent to it. Thus a bloom filter wouldn't be very private.

The typical use of a Bloom filter is to have it locally as a prefilter, not to send hashes to the server.
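As a sketch of the local-prefilter use, the Bloom filter below lives entirely on the client. The sizes, the hash construction, and the names are illustrative assumptions. A "no" from the filter is definite, so the word never needs to leave the machine; only a "maybe" would justify a further (local or remote) check.

```python
import hashlib


class BloomFilter:
    """Small Bloom filter: no false negatives, occasional false positives."""

    def __init__(self, size_bits: int = 1024, num_hashes: int = 3) -> None:
        self.size = size_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((size_bits + 7) // 8)

    def _positions(self, item: str):
        # Derive num_hashes bit positions by salting SHA-256 with an index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.size

    def add(self, item: str) -> None:
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, item: str) -> bool:
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(item)
        )


# Hypothetical local word list loaded into the filter at startup.
known_words = BloomFilter()
for w in ["apple", "banana", "cherry"]:
    known_words.add(w)


def needs_further_check(word: str) -> bool:
    """True only when the filter says 'maybe'; a 'no' is settled locally."""
    return known_words.might_contain(word)
```

Because the filter can return false positives, a "maybe" is not proof that the word is in the list; it only means the local filter cannot rule it out.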

account42

> I bet there's some cool way to do this with zero-knowledge or homomorphic cryptography though!

The code for which would almost certainly be larger than a fully local dictionary for any human language.

bmacho

> a typical spell checker feature is to provide alternative, correct, spellings, rather than just telling you whether a word is correctly spelled.

I personally don't use that one, for me the red underline is enough.

There are two scenarios I believe, first accidentally sending a (decent) password, and second the server not learning what you actually look up.

For the first case, sending a hash would prevent the server from learning a password that is not in the dictionary, something like password5 would hash to gibberish.

For the second, the server needs to know what to actually send back. I believe Google's malicious website check works (or used to) by truncating a hash and then just sending the answer for some 128 or so websites and having the browser figure out which of them the user wanted to visit. That creates some deniability over which website you actually visited and should also be usable to prevent the server from learning what you actually looked up.

So yes, I think you could design a more secure protocol. Though, general security disclaimer: the people trying to read your letters have probably spent more time attacking than I spent writing this post.

CGamesPlay

Just want to mention that the feature in question here is for translation, not spell checking.

paffdragon

Somewhat related, I was quite surprised when I discovered that my Samsung phone was sharing ALL my clipboard with all my other Samsung devices, including passwords copied into the clipboard, and even preserving the history. I can't remember if the sharing was enabled by default or I opted in by accident. I assume it also goes through their servers to reach my other devices. I could disable the sharing, but still can't turn off the clipboard history, even switching to a different keyboard, the Samsung keyboard still captures the clipboard and saves the history, when I switch the keyboard to Samsung everything is there... I guess my next phone won't be Samsung.

dannyw

Yes, and we know at least Samsung TVs sell your details and what you watch to marketers and everyone.

Samsung’s privacy policy is the same for phones and TVs.

yonatan8070

I noticed this happening through KDE connect, where passwords copied on Linux show up in Android's clipboard history, is there a way to block passwords from being transported around like that without completely disabling clipboard sharing altogether?

fhcbix

KDE connect lets you disable/configure individual plugins, just disable the "Clipboard sync". I don't think it can by itself figure out that you're copying a password, at least across UI toolkits. FWIW most toolkits and browsers don't actually copy from a password input anyway.

pabs3

You might want to disable KDE's clipboard history too btw, or at least find a solution that doesn't involve copying passwords. Either use the selection instead of the clipboard, or use password filling through non-clipboard channels.

nullify88

I usually suggest not to create or login with a Samsung Account on Samsung devices. It's just another opportunity for a company to get at your data.

xpressvideoz

[flagged]

paffdragon

> You're just an impatient paranoid, easily jumping to conclusions. You should be ashamed for spreading false information.

Thank you for your kind words, please look at the HN comment guidelines when you have a chance. Your point would have been an excellent correction if shared thoughtfully, but it's all negated by the name calling and personal attacks.

olejorgenb

IIRC, the GP claimed clipboard sharing between Samsung devices only happens over the local network

Elucalidavah

Querying a local dictionary on each clipboard seems okay; having a feature to request remote dictionaries is okay; making it easy to combine both is dubious but understandable (would be better off as a special flag); but having them combined by default? That's pretty much malicious.

maxglute

It's talking about querying youdao, which is more of a translation service. Offline translation < online translation, i.e. I don't want to fall back to a local Google offline translation language package unless I have no data. I don't use stardict, but it should be completely expected functionality if it translates more than single words like a dictionary.

This entire article should be: Chinese translation program sends clipboard data to its own website and Chinese translation services, but over HTTP.

CorrectHorseBat

[flagged]

dd_xplore

It's malicious intent! The developer isn't a kid, they're releasing the software for world wide use. It's a simple thing, do not send private data to remote servers without explicitly asking the user!

blackhaz

I'd go one step further and say it's a blatant Chinese SIGINT.

CorrectHorseBat

In your eyes maybe (and mine for the record), but different people have different values and expectations of what is privacy.

exe34

That's like saying Afghans have a different idea of consent.

komali2

Not really because "Chinese" is being used here as an indicator of nationality, not ethnicity.

I disagree with using it that way because it feeds into the CPC's propaganda mission to conflate the ethnicity of "Han" with citizenship of the PRC, which aids their cultural imperialism ('Taiwan is "Chinese" and we are "China" so therefore people in Taiwan are our people!'). Also the definition is being stretched to include anyone with even the vaguest ethnic ancestry from within territory ruled by the PRC or historic empires ("China" is a word that basically means "empire")

Anyway I agree that people from the PRC are more used to throwing up their hands at invasions of privacy since the government having total insight into your life is a given there, and to many a positive thing (they may believe it keeps them safe). I also believe that growing up as one of one billion people gives one a sense of useless anonymity - who cares if someone sees your clipboard, there's just too many people for it to matter.

jeroenhd

There definitely seems to be a cultural difference when it comes to privacy expectations from Chinese companies and western companies. Doesn't mean it's okay to do this kind of thing in a Debian package, of course, but I can understand how this could've happened.

eadmund

The Wayland framing at the end strikes me as misleading. This gets it exactly right:

> Or maybe StarDict would have started asking for special permissions to let it work on Wayland, and users would have accepted those defaults the same way they currently do.

Yes, that’s what it would do. Its installer might even configure that special permission automatically, without user intervention.

Malware’s gonna mal. Wayland might help defend against some things, but it’s not going to defend against packages installed as part of the distro.

heresie-dabord

It is not misleading, Wayland is better than Xorg in this particular respect.

But the other concern is part of the systemic problem. Consider that the data that was transmitted was sent in the clear!

> StarDict ... while running on X11, using Debian's default configuration, it will send a user's text selections over unencrypted HTTP to two remote servers.

> Any user who did read the description of the package, and who knew what the YouDao plugin would do, might nevertheless expect the resulting communication to at least be encrypted. But the plugin actually reaches out to its backend servers — dict.youdao.com and dict.cn — over unsecured HTTP. So, not only are these servers sent any text the user selects, but anyone who can view traffic anywhere along its path can see the same thing.

kelnos

It's extra misleading, because "Wayland" isn't a thing when it comes to policy like this. Unless a compositor implements some sort of user approve/deny UI when an app requests access to the clipboard, apps on Wayland can snoop on the clipboard just as easily as on X11. I haven't run GNOME or KDE in Wayland mode, so maybe they do implement something like that, but none of the wlroots-based compositors I've tried do.

CGamesPlay

It's really difficult to not assume malice with something like this. From the maintainer:

> The stardict has "Scan" function, when user enable this function, after user select some text, it will trigger stardict do translate for this selected text... Why the user selects some confidential data to query dictionary?

netsharc

Would be funny if they couldn't tell that the text in a foreign language is confidential... maybe it's stamped "秘密".

"Sir, we have intel, the enemy is having translation server errors."

hiAndrewQuinn

>This would normally not be much cause for concern; of course a dictionary program will include code to talk to dictionary-providing web sites.

Hey, an area I finally know something about. It depends on what you're trying to do.

The slimmed down version of a Finnish dictionary I provide in `tsk` [1] weighs in at around 30 MB, for about 250,000 Finnish words. It's small enough that I embed the whole dictionary directly into the binary and reconstruct the prefix search on the fly every time the user starts the app.

However, the much larger database which contains things like lemmatization and etymology information easily balloons up to many, many gigabytes in size. My problem domain is providing Truly Instant Lookup, keystroke by keystroke, so I can't really get around this level of memoization. The work to figure all this out was sufficient that I decided to make future versions a paid product instead [2].

Most other use cases would just call out to a server, because it's silly to think most people are going to download a giant database for that use case alone. A hybrid approach could also make a lot of sense, eg cache the most common 10,000 words locally and call out for the next 1.5 million, which are statistically extremely rare.

[1]: https://github.com/hiandrewquinn/tsk

[2]: https://taskusanakirja.com/ (offline for now until I get Digicert to certify my downloads wholesome for Windows resale)

avhception

While I have a lot of respect for the effort that goes into Debian, I always disliked this kind of "maximalism" from the package manager. Oh, the user wants "foo"? Let's install every software that might be even remotely useful somehow in combination with foo! Oh there is a network daemon in there? Fantastic, let's start it immediately!

I know that there is a flag to disable the installation for "recommended" packages. I just think the default is a disservice here.

bayindirh

I'll politely disagree.

First of all, "Recommends" is reserved for packages which enhance the functionality of the package you're installing. Without these the package will not break, but some very useful functionality might be disabled.

The package-class you're talking about is "suggests", IOW, "these packages might also be useful for you, wanna look?" section. These are not installed by default already.

On the other hand, apt and aptitude provide previews before doing something. You don't have to accept them. In aptitude's case, you can fine tune before the final commit, even.

There's a tension. Minimalism vs. user utility. Somebody said in the Debian 13 release comments that "Debian will never be an end-user friendly distro". Now, you're saying that packages shouldn't install recommends by default.

What should Debian be? "An IKEAesque DIY distro", or "A more user friendly, yet very stable and vanilla distro". I vote for the latter, personally. Plus, as I said before, advanced users are free to change whatever they want.

If you want to change the default, the configuration files are in /etc/apt/apt.conf.d/. If you want to disable the feature for a single install, the flag is --no-install-recommends.
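For anyone who wants to try it, here is a small Python sketch that writes such a drop-in file. The option name `APT::Install-Recommends` is real; the file name `99no-recommends` and the helper function are just illustrative choices, and the demo writes to a temporary directory rather than to /etc/apt/apt.conf.d/ so it can be run without root.

```python
from pathlib import Path
import tempfile

# The apt.conf line that turns off automatic installation of Recommends.
CONF_LINE = 'APT::Install-Recommends "false";\n'


def write_no_recommends_conf(conf_dir: Path, name: str = "99no-recommends") -> Path:
    """Write a drop-in apt config file into conf_dir and return its path.

    The file name is a hypothetical example; apt reads any file in its
    apt.conf.d directory, in lexical order.
    """
    conf_dir.mkdir(parents=True, exist_ok=True)
    target = conf_dir / name
    target.write_text(CONF_LINE, encoding="utf-8")
    return target


if __name__ == "__main__":
    # Demo against a throwaway directory; on a real system the target
    # would be /etc/apt/apt.conf.d/ and would need root privileges.
    with tempfile.TemporaryDirectory() as tmp:
        path = write_no_recommends_conf(Path(tmp))
        print(path.read_text(), end="")
```

The one-off equivalent needs no file at all: pass `--no-install-recommends` to `apt install` or `apt-get install` for that single command.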

avhception

Well, as a user of one of the more "IKEAesque" distros, I guess I have made my choice ;)

And that's perfectly fine, it just means I don't align with Debian on this one. And that freedom is what Linux is all about, I guess. So it seems it's working as intended :)

Edit: And I totally get that users might often want that kind of maximalism. It's just not for me. Although starting network daemons by default might sometimes be a bridge too far, or the case described in the article here.

bayindirh

While I'll argue that Debian's network daemons come with very sane defaults and an accompanying AppArmor profile to prevent both network disruptions and attack surface increases, I'm certainly not with the developer of StarDict. That thing smells malicious.

...and this is what Debian Testing is actually for. To catch these types of issues.

Of course, people are free to choose what resonates with them. I'm not against more DIY distributions (I'm also contemplating using an LFS VM to explore things even further, but time is an issue), and I'm not against your personal choices. I just wanted to note the tension, and share my observations about Debian.

account42

I agree that recommends makes sense but this is a bullshit argument:

> On the other hand, apt and aptitude provide previews before doing anything. You don't have to accept them. In aptitude's case, you can even fine-tune the selection before the final commit.

You can't expect the average user to understand the entire dependency tree and read the description of dozens of random packages that the average program pulls in. RTFM is not a valid excuse for bad defaults.

bayindirh

I don't expect the average user to read an entire dependency tree. However, apt and aptitude do a relatively good job of explaining the reasons for their actions.

Let me rephrase:

    1. Installation of recommended packages is a good default for the average user, because it provides functionality they expect.
    2. If the user is not happy with what's happening, changing the defaults is not hard.

IOW, if you don't like how your system behaves, read the documentation. Otherwise, I argue, the current defaults are good for the newcomer and the average Linux user. If you are at a point where you care which package is doing what, you're leaving the "average user / beginner" realm.

In the case of StarDict, as I noted elsewhere, I think the developer's answer is fishy, or ill-informed at least.

ethan_smith

This is a classic tension between convenience and security - Debian's "recommends" defaults were designed for a pre-cloud era when network connectivity wasn't assumed and local functionality was prioritized over potential security boundaries.

barosl

Actually, the default value of `APT::Install-Recommends` used to be false, and it was changed to true in Debian 5.0 Lenny (2009-02-14). I didn't like the change at the time because my Debian and Ubuntu systems suddenly installed more packages by default. However, now that I think of it, the distinction between recommended and suggested packages was blurry before the change, because both were opt-in. Auto-installing recommended packages while allowing the user to opt out is a better default, I guess. But I still turn off auto-installation of recommended packages on the systems I manage.

tremon

I don't have a problem with --install-recommends being the default. I think it's a fine distinction to have Recommends be "most of our users will want these" and Suggests be "this package provides some niche feature that most users won't need".

However, like you, I do have a problem with maintainers abusing the Recommends: field to further their own world domination plans. There is no valid reason that installing an archive tool should mandate a specific init system (looking at you, file-roller and gnome team in general).

account42

The other extreme where you are missing expected functionality because it's optional isn't any better. The problem is not that recommended dependencies are installed by default, it's that package recommendations should perhaps be more conservative. Note that Debian already differentiates between recommended dependencies (which most users should want) and suggested dependencies (related functionality or enhancements that are not relevant for every user).

rfoo

For me it's my most used super long command line flag.

For a brief moment `--break-system-packages` surpassed it, then I discovered `pip` accepts abbrev flags so `--br` is enough, and sounds like bruh.

IshKebab

> --break-system-packages

You can avoid that clusterfuck using `uv tool install`. E.g. `uv tool install pre-commit`.

zahlman

It's also not hard to just manage a damn virtual environment yourself.

bmacho

Am I the only one that gets incredibly angry when I read things like this? This is unacceptable on every level.

pcdoodle

You're not alone. He/she needs a pie in the face, Bill Gates style.

pabs3

There are numerous privacy issues in distros, some known, most probably unknown, some examples from Debian:

https://wiki.debian.org/PrivacyIssues

Luckily there are things like opensnitch that can block some of these issues:

https://github.com/evilsocket/opensnitch

account42

Your link is about privacy issues in upstream software that Debian hasn't sufficiently worked around yet. The main advantage of the distro model (as opposed to developer-maintained package ecosystems) is exactly that there is someone protecting you from questionable software "features".

pabs3

Agreed, but it is definitely not enough, which is why some Debian folks packaged opensnitch.

amiga386

I don't think Debian intentionally shields you from privacy-invading software. Other distros may differ on this point.

Debian does not mandate anything about privacy in its Policy Manual (which sets the standards for selecting and packaging software that maintainers must adhere to): https://www.debian.org/doc/debian-policy/search.html?q=priva...

There's also no insistence on privacy in the Debian Social Contract or DFSG (not that these would be appropriate places for it, they're mainly about licensing)

pabs3

> I don't think Debian intentionally shields you from privacy-invading software

There is a culture of valuing privacy though, including patching out privacy issues. Especially since a lot of Debian folks are from Europe, with corresponding GDPR knowledge.

I know that the lintian warnings pointing out privacy issues in HTML documentation do get a lot of patches.

Also, opensnitch is packaged as a mitigation.

You are right about the policy problem, Debian really needs to do something about that.

There is at least a privacy policy for Debian services.

https://www.debian.org/legal/privacy

fsflover

> I don't think Debian intentionally shields you from privacy-invading software.

Don't they change the Firefox defaults for more privacy?

GrayShade

Who protects you when the packagers decide to trust a shady CA (adding it to the root store) because it's used by the distro's infra?

account42

Is this supposed to be some kind of gotcha argument? Against what?

graemep

That is interesting.

There is nothing in that list anything like as bad as this. The next worst is Chromium which is no surprise.

fsflover

Are you saying this is ordinary behavior? Nothing in your links comes close, especially in Debian.

pbohun

I don't understand why the whole thing isn't local. A comprehensive Chinese dictionary has fewer than 400k words. Even at 1 kB per word that's less than 400 MB.

It's just poor design to make something require a network connection when it could work offline locally.
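The back-of-the-envelope estimate can be checked in a couple of lines of Python. The entry count and per-entry size are the rough upper bounds from the comment, not measured values, and the function name is just illustrative.

```python
def dictionary_size_mb(entries: int, bytes_per_entry: int) -> float:
    """Return the uncompressed size in decimal megabytes (1 MB = 10**6 bytes)."""
    return entries * bytes_per_entry / 1_000_000


if __name__ == "__main__":
    # Upper bounds from the comment: 400k entries at 1 kB each.
    print(dictionary_size_mb(400_000, 1_000))
```

Plain text usually compresses well, so an on-disk copy would likely be smaller than this figure.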

slanterns

Then we should have a copy-left dictionary first.

blackhaz

If it were up to me, I would kick-ban StarDict from the distribution immediately, and scrutinize i) the maintainer, for all the packages he has ever touched, and ii) the StarDict authors, for allowing such default behavior in their software.