
Rsync replaced with openrsync on macOS Sequoia

adrian_b

Looking at the sparse documentation of openrsync does not create any confidence for me that it can be an acceptable substitute for rsync.

In my opinion, any program that is supposed to copy files, but which is not able to make perfect copies, i.e. copies that do not lose any bit of data or metadata that was present in the original file, is just unusable garbage.

Unfortunately, most copying programs available in UNIX-like operating systems (and also many archiving programs) do not make perfect file copies with their default options, and many of them are never able to make perfect copies, regardless of what options are used.

I have not looked recently at the scp command of ssh, but at least until a few years ago it was not possible to make perfect file copies with scp, especially when the copies were done between different operating systems and file systems. That is why I never use scp, but only rsync over ssh.

Rsync is the only program I have seen that is able (with the right options) to make perfect file copies even between different operating systems and file systems (for instance between FreeBSD with UFS and Linux with XFS), also preserving metadata like extended file attributes, access control lists and high-precision file timestamps (some copying programs and archiving programs truncate high-precision timestamps).

The current documentation of openrsync does not make any guarantee that it can make complete file copies, so by default I assume that it cannot; for now it is a program that I consider useless.

Besides rsync for copying, one of the few Linux archiving programs that can archive perfect file copies is bsdtar (when using the pax file format; the ancient tar and cpio file formats cannot store all modern file metadata).

(FYI: I always alias rsync to '/usr/bin/rsync --archive --xattrs --acls --hard-links --progress --rsh="ssh -p XXX -l YYYYYYY"')

(With the right CLI options, "cp" from coreutils can make perfect file copies, but only if it has been compiled with appropriate options; some Linux distributions compile coreutils with wrong options, e.g. without extended file attributes support, in which case "cp" makes only partial file copies, without giving any warnings or errors.)
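
For reference, a minimal sketch of such a "full" copy with GNU coreutils (file names here are placeholders, and whether the xattrs actually survive depends on how coreutils was built and on the file systems involved):

    $ cp -a original.dat copy.dat        # GNU cp: -a is equivalent to -dR --preserve=all (mode, ownership, timestamps, links, context, xattr)
    $ getfattr -d original.dat copy.dat  # spot-check the user.* extended attributes on both files (needs the attr package)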

inglor

As a contrast to your take - I work for a backup company and I was really surprised to discover that most of our customers (big enterprises) really do not care whether 99% of the metadata is restored correctly and are fine with just restoring the data.

(We restore everything super carefully but sometimes I feel like we're the only ones who care)

nolok

I'm willing to bet a decent number "don't care" until they do care because their permissions don't work or their time based script screws up or whatever else nobody thinks about when they're in panic mode about "I lost my data".

ExoticPearTree

In case of a complete disaster recovery, the fact that a script or two might fail is super OK. That's why after recovery there's always the cleanup phase where you fix stuff that broke during recovery.

mmcnl

They don't care because you care, so they never experienced the misfortune of not caring.

crabbone

Nah. Not really. A lot of the useful data out there doesn't need ACLs, precise dates (or any dates at all), etc.

Also, a lot of application-specific data formats already don't care about the "extra" attributes available in various filesystems, because those aren't universally supported, and instead implement such features themselves in the file format they operate on. For example, DICOMs, or password-protected PDFs or Zip archives, etc.

mbrumlow

I too have worked at a backup company and I can’t recall any of the customers caring or even knowing about the metadata.

We would only care if the software our customers were running did. Big enterprise software suites were designed to run in hostile environments; as such, they mostly rely on their own data formats and don’t care about attributes from the filesystem beyond having access to the files.

mohas

I'm with you on this. I think that data is 99% of what is important and the rest can be recreated or improvised; if your system relies too much on file metadata, you need more engineering.

buttercraft

> if your system relies too much on file metadata, you need more engineering

Except sometimes it's a 3rd party's app whose data you have to restore, and you don't have control over their engineers.

ForHackernews

If that information drives operational processes then you can argue it is data, not metadata.

m463

That's the kind of nonsense thinking that leads to folks like Apple removing critical features that "no one uses".

Reminds me of that Yogi Berra quote: "Nobody goes there anymore, it's too crowded."

For example, many people don't even understand target disk mode on Apple hardware, but it has saved me countless hours over the years and made administering Apple systems a breeze. Ask people who've used target disk mode if they can imagine going without it.

on another subject - it's worth mentioning that time machine is based on rsync.

treve

Apple is the wrong horse to back on for this sort of thing.

dcow

> The current documentation of openrsync does not make any guarantee that it can make complete file copies, so by default I assume that it cannot, so for now it is a program that I consider useless.

Is it possible this is just a documentation style-tone mismatch? My default assumption would be that openrsync is simply a less restrictively licensed rsync, and I wouldn’t assume it works any differently. Have you verified your strong hypothesis? Or are you just expressing skepticism? It’s hard to tell exactly.

Edit: I read the openrsync readme. It says it’s compatible with rsync and points the reader to rsync’s docs. Unless extended file attributes, ACLs, and high resolution timestamps are optional at the protocol level, it must support everything modern rsync supports to be considered compatible, right? Or are you suggesting it lies and accepts the full protocol but just e.g. drops ACLs on the floor?

wkat4242

From the article:

> The openrsync command line tool is compatible with rsync, but as noted in the documentation openrsync accepts only a subset of rsync’s command line arguments.

dcow

Yes but that doesn't necessarily mean it is lacking the functionality to fully copy metadata. It could mean that openrsync has removed archaic and vestigial options to simplify implementation.

WhyNotHugo

OpenRsync is from the OpenBSD project. This is typically an indicator of good quality and a good focus on security. However, in this case, even the official website indicates:

> We are still working on it... so please wait.

SoftTalker

OpenBSD often takes an approach of removing rarely-used or archaic functionality to achieve simplicity in code or configuration or improved security. They gutted a lot of openssl when they made libressl. Their OpenSMTPD is vastly simpler than something like postfix or sendmail.

openrsync is very likely good code, but that doesn't mean it replicates every feature of the rsync utility.

onetom

OpenBSD removed ACL support though, iirc

SoftTalker

OpenBSD's filesystem doesn't have them. It just has normal unix permission bits.

graemep

This is a licensing issue for Apple, and only a small proportion of their users will care about this, and those users will just install rsync.

adrian_b

You are right, but I have written my comment precisely to make those users aware of this problem.

I consider this a very serious problem, because most naive users will assume automatically that when they give a file copy command they obtain a perfect duplicate of the original file.

It is surprising for them to discover that this is frequently not true.

sneak

And the rsync that has historically come with macOS was always way out of date, so we end up installing a newer one anyway. This doesn’t change much.

nix0n

I was able to find (in source-code form) the list of what arguments openrsync accepts[0].

Among the options you use in your alias, --xattrs, --acls and --hard-links are all missing.

[0]https://github.com/kristapsdz/openrsync/blob/a257c0f495af2b5...

karel-3d

The rsync version Apple currently ships is from 2006. It predates the iPhone.

adestefan

That's the last GPLv2-only version of rsync.

_fat_santa

Honest question: what prevents Apple from using software that's licensed under GPL v3 vs v2?

scrapheap

What do you mean by perfect copies here? Do you mean the file content itself or are you also including the filesystem attributes related to the file in your definition?

adrian_b

A file consists of data and various metadata, e.g. file name, timestamps, access rights, user-defined file attributes.

By default, a file copy should include everything that is contained in the original file. Sometimes the destination file system cannot store all the original metadata, but in such cases a file copying utility must give a warning that some file metadata has been lost, e.g. when copying to a FAT file system or to a tmpfs file system as implemented by older Linux kernels. (Many file copy or archiving utilities fail to warn the user when metadata cannot be preserved.)

Sometimes you may no longer need some of the file metadata, but the user should be the one who chooses to lose some information; it should not be the default behavior, especially when this unexpected behavior is not advertised anywhere in the documentation.

The origin of the problem is that the old UNIX file systems did not support many kinds of modern file metadata, i.e. they did not have access control lists or extended file attributes and the file timestamps had a very low resolution.

When the file systems were modernized (XFS was the first Linux file system supporting such features, then slowly also the other file systems were modernized), most UNIX utilities were not updated until many years later, and even then the additional features remained disabled by default.

Copying between different computers, as rsync does, creates additional problems, because even if e.g. both Windows and Linux have extended file attributes, access control lists and high-resolution file timestamps, the APIs used for accessing file metadata differ between operating systems, so a utility like rsync must contain code able to handle all such APIs, otherwise it will not be able to preserve all file metadata.
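
For anyone who wants to check what actually survived a copy, here is a rough sketch of the comparison on the Linux side (paths are placeholders; getfattr and getfacl come from the attr and acl packages):

    $ stat -c '%a %U:%G %y' src/file dst/file   # compare mode, owner/group and nanosecond mtime (GNU stat)
    $ getfattr -d -m - src/file dst/file        # compare extended attributes (all namespaces readable to you)
    $ getfacl src/file dst/file                 # compare POSIX ACLs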

scrapheap

But what you're referring to here are the attributes that the file system stores about the file, not the file itself. By default I wouldn't expect a copy of a file to have identical file system attributes, just an identical content for the file. I would expect some of the file system attributes to be copied, but not all of them.

Take the file owner, for example: if I take a copy of a file, then by default I should be the owner of that copy, as it's my copy of the file and not the original file owner's copy.

An alternative way of looking at it: if I have created a file on my local machine that's owned by root and has the setuid bit set in its file permissions, then there's no way I should be able to copy that file up to a server with my normal user account and have those attributes still set on the copy.

prmoustache

The cp command does copy the file data but not the metadata. There is a reason we have come up with 2 words to distinguish them.

Rsync only copies the metadata when you specifically ask it to anyway. I haven't had a look at the openrsync man page, but I would assume the latter behaves the same way.
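
For the record, a quick sketch of the difference with upstream rsync (flags as documented in its man page; paths are placeholders):

    $ rsync -a src/ dst/      # "archive" mode: permissions, times, owner/group, symlinks - but no ACLs, xattrs or hard links
    $ rsync -aHAX src/ dst/   # additionally preserve hard links (-H), ACLs (-A) and extended attributes (-X)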

fhars

It means that if you copy a file from NTFS to ext4, ext4 will magically sprout support for alternate data streams.

johnisgood

And all files from NTFS have +x. :|

nickelpro

I actively do not want this in a file copy utility; relying on extended file attributes is a massive anti-pattern. If you care about timestamps, they go in the file format itself. If you care about permissions, those belong in the provisioning and access systems in front of the file: the web application or other API that is providing the access.

I expect file attributes of the target to be what I say they should be, not copied over from wherever the content happened to live before.

shwouchk

you only ever look at files through a web application or other api?

nickelpro

Of course not, but I don't rely on the extended file attributes for anything important such that they need to be replicated during copies

thrdbndndn

As a relatively new Linux user, I often find the "versioning" of bundled system utilities also to be a bit of a mess, for lack of a better word.

A classic example, at least from my experience, is `unzip`. On two of my servers (one running Debian and the other an older Ubuntu), neither of their bundled `unzip` versions can handle AES-256 encrypted ZIP files. But apparently, according to some Stack Overflow posts, some distributions have updated theirs to support it.

So here is what I ran into:

1. I couldn't easily find an "updated" version of `unzip`, even though I assume it exists and is open source.

2. To make things more confusing, they all claim to be "version 6.00", even though they obviously behave differently.

3. Even if I did find the right version, I'm not sure if replacing the system-bundled one is safe or a good idea.

So the end result is that some developer out there (probably volunteering their time) added a great feature to a widely used utility, and yet I still can’t use it. So in a sense, being a core system utility makes `unzip` harder to update than if it were just a third-party tool.

I get that it's probably just as bad if not worse on Windows or macOS when it comes to system utilities. But I honestly expected Linux to handle this kind of thing better.

(Please feel free to correct me if I’ve misunderstood anything or if there’s a better way to approach this.)

adwf

In the specific case here, 7z is your friend for all zips and compressed files in general; I'm not sure I've ever used unzip on Linux.

Related to that, the Unix philosophy of simple tools that do one job and do it well also applies here a bit. A more typical workflow would be a utility to tarball something, then another utility to gzip it, then finally another to encrypt it, leading to file extensions like .tar.gz.pgp, all from piping commands together.
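
For instance, a rough sketch of that kind of pipeline with gpg (directory name and recipient are placeholders):

    $ tar -cf - mydir | gzip | gpg -e -r alice@example.org -o mydir.tar.gz.pgp   # pack, compress, encrypt to a key
    $ gpg -d mydir.tar.gz.pgp | gunzip | tar -xf -                               # and the reverse: decrypt, decompress, unpack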

As for versioning, I'm not entirely sure why your Debian and Ubuntu installs both claim version 6.00, but that's not typical. If this is for a personal machine, I might recommend switching to a rolling release distro like Arch or Manjaro, which at least give up-to-date packages on a consistent basis, tracking the upstream version. However, this does come with its own set of maintenance issues and increased expectation of managing it all yourself.

My usual bugbear complaint about Linux (or rather OSS) versioning is that people are far too reluctant to declare v1.00 of their library. Leading to major useful libraries and programs being embedded in the ecosystem, but only reaching something like v0.2 or v0.68 and staying that way for years on end, which can be confusing for people just starting out in the Linux world. They are usually very stable and almost feature complete, but because they aren't finished to perfection according to the original design, people hold off on that final v1 declaration.

Squossifrage

Info-Zip Unzip 6.00 was released in 2009 and has not been updated since. Most Linux distros (and Apple) just ship that 15-plus-year-old code with their own patches on top to fix bugs and improve compatibility with still-maintained but non-free (or less-free) competing implementations. Unfortunately, while the Info-Zip license is pretty liberal when it comes to redistribution and patching, it makes it hard to fork the project; furthermore, anyone who wanted to do so would face the difficult decision of either dropping or trying to continue to support dozens of legacy platforms. Therefore, nobody has stepped up to take charge and unify the many wildly disparate mini-forks.

setopt

> Related to that, the Unix philosophy of simple tools that do one job and do it well, also applies here a bit. More typical workflow would be a utility to tarball something, then another utility to gzip it, then finally another to encrypt it. Leading to file extensions like .tar.gz.pgp, all from piping commands together.

I do this for my own files, but half of the time I zip something, it’s to send it to a Windows user, in which case zip is king.

issafram

fyi latest version of Windows 11 supports native opening of 7zip files

aragilar

The issue in this case is upstream is dead, so there are random patches. Same thing happened to screen for a bit.

DonHopkins

The "Unix Philosophy" is a bankrupt romanticized after the fact rationalization to make up excuses and justifications for ridiculous ancient vestigial historic baggage like the lack of shared libraries and decent scripting languages, where you had to shell out THREE heavyweight processes -- "[" and "expr" and a sub-shell -- with an inexplicable flurry of punctuation [ "$(expr 1 + 1)" -eq 2 ] just to test if 1 + 1 = 2, even though the processor has single cycle instructions to add two numbers and test for equality.

chubot

??? This complaint seems more than 20 years too late

Arithmetic is built into POSIX shell, and it's universally implemented. The following works in basically every shell, and starts 0 new processes, not 2:

    $ bash -c '[ $((1 + 1)) = 2 ]; echo $?'
    0
    $ zsh -c '[ $((1 + 1)) = 2 ]; echo $?'
    0
    $ busybox ash -c '[ $((1 + 1)) = 2 ]; echo $?'
    0

YSH (part of https://oils.pub/ ) has a more familiar C- or JavaScript-like syntax:

    $ ysh -c 'if (1 + 1 === 2) { echo hi }'
    hi

It also has structured data types like Python or JS:

    $ echo '{"foo": 42}' > test.json
    $ ysh
    ysh-0.28$ json read < test.json
    ysh-0.28$ echo "next = $[_reply.foo + 1]"
    next = 43

and floats, etc.

    $ echo "q = $[_reply.foo / 5]"
    q = 8.4

https://oils.pub/release/latest/doc/ysh-tour.html (It's probably more useful for scripting now, but it's also an interactive shell)

verandaguy

    > TWO heavyweight processes

If you're going to emphasize that it's two processes, at least make sure it's actually two processes. `[` is a shell builtin.

    > `eval` being heavy

If you want a more lightweight option, `calc` is available and generally better-suited.

    > inexplicable flurry of punctuation

It's very explicable. It's actually exceptionally well-documented. Shell scripting isn't syntactically easy, which is an artifact of its time plus standardization. The bourne shell dates back to 1979, and POSIX has made backwards-compatibility a priority between editions.

In this case:

- `[` and `]` delimit a test expression

- `"..."` ensure that the result of an expression is always treated as a single-token string rather than splitting a token into multiple based on spaces, which is the default behaviour (and an artifact of sh and bash's basic type system)

- `$(...)` denotes that the expression between the parens gets run in a subshell

- `-eq` is used for numerical comparison since POSIX shells default to string comparison using the normal `=` equals sign (which is, again, a limitation of the type system and a practical compromise)

    > even though the processor has single cycle instructions to add two numbers and test for equality

I don't really understand what this argument is trying to argue for; shell scripting languages are, for practical reasons, usually interpreted, and in the POSIX case, they usually don't have to be fast since they're usually just used to delegate operations off to other code for performance. Their main priority is ease of interop with their domain.

If I wanted to test if one plus one equals two at a multi-terabit-per-second bandwidth I'd write a C program for it that forces AVX512 use via inline assembly, but at that point I think I'd have lost the plot a bit.

whatnow37373

Shell != Unix (philosophy) as I’m sure you are aware. The unix philosophy is having a shell and being able to replace it, not its particular idiosyncrasies at any moment in time.

This is like bashing Windows for the look of its buttons.

a-french-anon

I don't see what crusty implementation details have to do with a philosophy. In fact, UNIX itself is a poor implementation of the "UNIX" philosophy, which is why Plan 9 exists.

The idea of small composable tools doing one thing and doing it well may have been mostly an ideal (and now pretty niche), but I don't think it was purely invented after the fact. Just crippled by the "worse is better".

pjmlp

The "Unix Philosophy" is some cargo cult among FOSS folks that never used commercial UNIX systems, since Xenix I haven't used any that doesn't have endless options on their man pages.

eesmith

I realized the hype for the Unix Philosophy was overblown around 1993 when I learned Perl and almost immediately stopped using a dozen different command-line tools.

tecleandor

Was there any problem with 7z some years ago? I feel like I've been actively avoiding it because I have the feeling I read something bad about it, but I can't remember what. I could have mixed it up with something else; it sometimes happens to me.

oblio

Hard to say for sure, did SourceForge put malware in their installers many millennia ago?

pxc

I came here to make the same recommendation. Just use p7zip for everything; no need to learn a bunch of different compression tools.

setopt

If you use `atool`, there is no need to use different tools either – it wraps all the different compression tools behind a single interface (`apack`, `aunpack`, `als`) and chooses the right one based on file extensions.

cogman10

Debian and Ubuntu tend to want to lock the version of system tools to the version of the OS.

Debian tends to have long release cycles, but is very stable. Everything will work perfectly together on stable (in fact, testing tends to be almost as good, stability-wise, as other OSes).

Ubuntu is basically Debian with "but what if we released more frequently?".

If you want the latest tools, then you'll have to settle for a less stable OS (sort of). Nix and Arch come to mind. Neither are super user friendly.

If you want stable and the latest tools, Gentoo is the way to go. However, it's even more intimidating than Arch.

If you want stability and simplicity, then the other way to go is sacrificing disk space. Docker/podman, flatpak, appcontainers, and snap are all contenders in this field.

Windows and Mac both have the same problem. Windows solved this by basically just shipping old versions of libraries and dynamically linking them in based on what app is running.

chrismorgan

I find it funny calling Arch “less stable”, because I’m inclined to find it more stable, for my purposes, skills and attitudes.

I’ve administered at least one each of: Ubuntu server (set up by another; the rest were by me), Ubuntu desktop at least ten years ago, Arch desktop, Arch server.

The Arch machines get very occasional breakages, generally either very obvious, or signposted well. I did have real trouble once, but that was connected with cutting corners while updating a laptop that had been switched off for two years. (I’ve updated by more than a year at least two other times, with no problems beyond having to update the keyring package manually before doing the rest. The specific corners I cut this one time led to the post-upgrade hooks not running, and I simply forgot to trigger them manually in order to redo the initcpio image, because I was in a hurry. Due to boot process changes, maybe it was zstd stuff, can’t remember, it wouldn’t boot until I fixed it via booting from a USB drive and chrooting into it and running the hooks.)

Now Ubuntu… within a distro release it’s no trouble, except that you’re more likely to need to add external package sources, which will cause trouble later. I feel like Ubuntu release upgrades have caused a lot more pain than Arch ever did. Partly that may be due to differences in the sorts of packages that are installed on the machines, and partly it may be due to having used third-party repositories and/or PPAs, but there were reasons why those things had to be added, whether because software or OS were too old or too new, and none of them would have been needed under Arch (maybe a few AUR packages, but ones where there would have been no trouble). You could say that I saw more trouble from Ubuntu because I was using it wrong, but… it wouldn’t have been suitable without so “using it wrong”.

odo1242

Fedora strikes a pretty good tradeoff on the “is user friendly” and “has latest tools regardless of system version” balance, I would say.

rurban

Exactly. Much more stable and much more up to date than Debian derivatives. But far fewer packages, too.

thayne

"stable" as used to describe debian (and Ubuntu) means "does not change", which includes adding new functionality.

damentz

Correct, another way of looking at it is from a programming angle. If Debian fixes a bug and that fix breaks your tool, then Debian is unstable. Therefore, to maintain stability, Debian must not fix bugs unless they threaten security.

The term "stable" is the most polluted term in Linux, it's not something to be proud of. Similar to how high uptime was a virtue, now it just means your system probably has been pwned at some point.

jjayj

The other option here is "pick an OS and when necessary install newer packages from source."

We've been doing this for a long time at my current workplace (for dev containers) and haven't run into any problems.

tame3902

unzip is a special case: upstream development has basically stopped. The last release was in 2009[0]. (That's version 6.0.) Since then, multiple issues have been discovered, and it lacks some features. So everybody patches the hell out of that release[1]. The end result is that you have very different executables with the same version number.

[0]: https://infozip.sourceforge.net/UnZip.html

[1]: here the build recipe from Arch, where you can see the number of patches that are applied: https://gitlab.archlinux.org/archlinux/packaging/packages/un...

blueflow

I maintain a huge number of git mirrors of git repositories and I have some overview of activity there. Many open source projects have stopped activity and/or do not make any new releases. Like syslinux, which seems to be in a similar situation as unzip. And some projects like Quagga went completely awol and don't even have a functional git remote.

So unzip is not really that special; it's a more general problem of waning interest.

tame3902

I wasn't trying to imply that unzip is the only one.

But the way I learned that unzip is unmaintained was pretty horrible. I found an old zip file I created ages ago on Windows. Extracting it on Arch caused no problem. But on FreeBSD, filenames containing non-ASCII characters were not decoded correctly. Well, they probably use different projects for unzip, this happens. Wrong, they use the same upstream, but each decided to apply different patches to add features. And some of the patches address nasty bugs.

For something as basic as unzip, my experience as a user is that when it has so many issues, it either gets removed completely or it gets forked. The most reliable way I found to unzip a zip archive consists of a few lines of python.

erinnh

Quagga got forked though and is actively being developed.

FRRouting is the fork.

soraminazuki

Distros are independent projects, so that's to be expected IMO. Though some level of interoperability is nice, diverse options being available is good.

That said, most distros have bsdtar in their repositories so you might want to use that instead. The package might be called libarchive depending on the distro. It can extract pretty much any format with a simple `bsdtar xf path/to/file`. AES is also supported for zips.

macOS includes it by default and Windows too IIRC, in case you're forced to become a paying Microsoft product^Wuser.
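
A hedged sketch of the encrypted case (archive name and passphrase are placeholders; --passphrase is documented in the bsdtar man page, and bsdtar will normally prompt if you leave it out):

    $ bsdtar -xf encrypted.zip --passphrase 'hunter2'   # extract an AES-encrypted zip with libarchive's bsdtar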

__MatrixMan__

It is a mess. My suggestion is to just rely on the built-in stuff as little as possible.

Everything I do gets a git repo and a flake.nix, and direnv activates the environment declared in the flake when I cd to that dir. If I write a script that uses grep, I add the script to the repo and I add pkgs.gnugrep to the flake.nix (also part of the repo).

This way, it's the declared version that gets used, not the system version. Later, when I hop from MacOS to Linux, or vice versa, or to WSL, the flake declares the same version of grep, so the script calls the same version of grep, again avoiding whatever the system has lying around.

It's a flow that I rather like, although many would describe nix as unfriendly to beginners, so I'm reluctant to outright recommend it precisely. The important part is: declare your dependencies somehow and use only declared dependencies.

Nix is one way to do that, but there's also docker, or you could stick with a particular language ecosystem. python, nodejs, go, rust... they all have ways to bundle and invoke dependencies so you don't have to rely on the system being a certain way and be surprised when it isn't.

A nice side effect of doing this is that when you update your dependencies to newer versions, that ends up in a commit, so if everything breaks you can just check out the old commit and use that instead. And these repos, they don't have to be for software projects--they can just be for "all the tools I need when I'm doing XYZ". I have one for a patio I'm building.

Spivak

This is the way, system packages are for the system. Everything you need lives in .local or in your case /nix. The amount of tooling headaches I've had to deal with is pretty close to zero now that I don't depend on a platform that by design is shifting sand.

NoboruWataya

I use Arch on my personal laptop daily but have Debian installed on a VPS, and this is one aspect of Debian that bugs me (though I totally understand why they do it). I am so used to having the latest version of everything available to me very quickly on Arch, I am quite commonly stung when I try to do something on my VPS only to find that the tools in the Debian repos are a few versions behind and don't yet have the features I have been happily using on Arch. It's particularly frustrating when I have been working on a project on my personal laptop and then try to deploy it on my VPS only to find that all of the dependencies are several versions behind and don't work.

Again, not a criticism of Debian, just a friction I noticed moving between a "bleeding edge" and more stable distro regularly.

everfrustrated

If you want the latest version of everything, you are looking for Debian Unstable.

MisterTea

> As a relatively new Linux user,

You need to understand that you are now in Unix land, which means you compose this pipeline using programs that perform each step of the process. So when creating an encrypted backup you would use: `tar -c /home/foo | gzip | aescrypt >backup.tgz.aes` or something to that effect. This lets you use whatever compression program you like in the pipe.

Breaking this composability leads to the kind of problem you are complaining about. It also removes the ability to split this pipeline across machines, allowing you to distribute the compute cost.

procaryote

Compressing and encrypting as separate operations would bypass this issue.

A symmetrically encrypted foo.zip.gpg or foo.tgz.gpg would work in a lot more places than a bleeding-edge zip version. Also, you get better-tested and audited encryption code.
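
Something like this minimal sketch, for instance (names are placeholders; gpg -c is symmetric mode and prompts for a passphrase):

    $ tar -czf - somedir | gpg -c -o somedir.tgz.gpg   # compress and symmetrically encrypt in one pipeline
    $ gpg -d somedir.tgz.gpg | tar -xzf -              # decrypt and unpack on the other end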

duskwuff

On one hand, it's a little annoying that openrsync doesn't support some features that rsync does.

On the other hand, it's great that there are multiple independent implementations of rsync now. It means that it's actually being treated as a protocol, not just a piece of software.

varenc

I'm excited about this too. It becoming more like a protocol makes me optimistic we'll see binary diff API points based on the rsync algorithm.

fun fact: Dropbox internally used rsync binary diff to quickly upload small changes to large files. I assume they still do. But their public API endpoints don't offer this, and a small change to a large file means the whole file must be uploaded.

zmj

I implemented rsync's binary diff/patch in .NET several years ago: https://github.com/zmj/rsync-delta

It's a decent protocol, but it has shortcomings. I'd expect most future use cases for that kind of thing to reach for a content-defined chunking algorithm tuned towards their common file formats and sizes.

andrewflnr

> binary diff API points based on the rsync algorithm

Now that's an idea I never considered. Nice.

nine_k

Now consider applying it to git. How about clean semantic diffs to your .xlsx files? To your .PNG files?

secure

Indeed! Have a look at http://github.com/stapelberg/rsync-over-grpc/, where I demonstrate how to use the rsync protocol (specifically, my https://github.com/gokrazy/rsync implementation) over gRPC.

Very handy if SSH+rsync is locked down in your corporate environment, but building services with gRPC isn’t :)

chungy

The website says "We are still working on it... so please wait."

rsync has a lot of features, surely this will take a good amount of time.

drob518

librsync, anyone?

edoceo

LGPL

mattl

librsync is distributed under the GNU LGPL v2.1

I can see no reason why Apple wouldn't be fine with that.

DrillShopper

Maybe Apple should stop leeching off Free Software then

candiddevmike

How does this mean rsync is a protocol?

somat

It was always a protocol; however, it is never good when the protocol is defined by its only implementation.

My understanding is that this is the whole reason for the existence of openrsync. The people working on the RPKI standards wanted to use rsync for one type of transfer, but the standards body (IETF?) balked, concerned that the rsync protocol had only one implementation. So the OpenBSD folk, specifically Kristaps Dzonsons, stepped up and wrote a second implementation. It does not do everything rsync does, but it interoperates enough for the RPKI project.

https://man.openbsd.org/rpki-client

superkuh

> however, it is never good when the protocol is defined by its only implementation

One counter-example to this is in desktop GUI environments. You want one single strong reference implementation there for stability/consistent expectations of what will run. Pretty much everything that will run on the eleventh X protocol will work on X.org's X11 everywhere. Whereas the core wayland protocol is not feature complete and the reference implementation weston is weak, so every wayland compositor implements what should be core wayland protocol features with their own choice of third-party lib or custom code. Like libei vs libinput vs no support at all (weston) for normal keyboard/mouse features. Software that works on one wayland won't work on others.

My point here is that strong single reference implementations prevent fragmentation. And sometimes that's important. This is not one of those cases and I'm glad to see more rsync protocol implementations.

josephg

> it is never good when the protocol is defined by its only implementation

I don't know that I'd go that far. The benefit of having only one implementation of a protocol is that the protocol can evolve much faster. You don't have to have committee meetings to tweak how it works. And as a first pass, the more iterations you make of something, the better the result.

Rsync is mature enough to benefit from multiple implementations. But I'm glad it had some time to iterate on the protocol first.

bombela

Think ssh, http etc


watersb

Patches to mainline rsync added support for extended attributes, particularly for supporting macOS metadata.

Bombich "Carbon Copy Cloner" is a GUI app that wraps it.

https://support.bombich.com/hc/en-us/articles/20686446501143...

I started following Mike Bombich from his posts on macOS Server sysadmin boards; see

https://web.archive.org/web/20140707182312/http://static.afp...

Nathaniel Gray created a testing tool to verify the fidelity of backups; files with multiple streams, extended attributes and ACLs, all the good stuff... Backup Bouncer:

https://github.com/n8gray/Backup-Bouncer

See also this SwiftUI app that wraps rsync, RsyncX.

https://github.com/rsyncOSX/RsyncOSX

We used to really care about this stuff, back when we were still running software from "Classic" macOS on top of our new UNIX systems.

https://web.archive.org/web/20161022012615/http://blog.plast...

doctorpangloss

The problem with rsync is that it is ridiculously slow.

IFileOperation (Windows) and FileManager (macOS) will do the most performant copy supported by the underlying FS.

Enabling CRC checks is a checkbox in SMB and ReFS - rsync's content matching step is redundant to a modern SMB share on a modern Windows Server. Windows to Windows, IFileOperation will be like 1.5-8x faster throughput with lower CPU usage than rsync, and maybe 1.2-3x faster than doing a file copy using vanilla golang.

And if you don't care about the operating systems that actually use all the complex filesystem metadata, if you only care about Linux, then you only need openrsync or simpler programs.

jeroenhd

So, anyone got a good resource on why Apple is so afraid of GPLv3? Surely this shouldn't be a problem as long as they statically compile the executables?

ninkendo

GPL3 closes what was considered a loophole, where device makers would ship a product derived from GPL’d code, and release the source, but provide no ability for users to actually compile and run that source on the device (this was called “tivo-ization” at the time, because TiVo did it.)

So for iOS, it’s pretty obvious why they don’t use gplv3… because it would violate the terms.

For macOS they could certainly get away with shipping gplv3 code, but they do a lot of code sharing between iOS and macOS (and watchOS/tvOS/visionOS/etc) and it doesn’t make much sense to build on a gplv3 foundation for just one of these operating systems and not the others. So it’s simpler to just not use it at all.

It also means they’re more free to lock down macOS from running your own code on it in the future, without worrying about having to rip out all the gpl3 code when it happens. Better to just not build on it in the first place.

mappu

> this was called “tivo-ization” at the time, because TiVo did it.

It's not widely known but what TiVo actually did was something different than this, and both RMS and the SFC believe that both the GPLv2 and GPLv3 allow what TiVo actually did. Some discussion and further links via https://lwn.net/Articles/858905/

imcritic

I'm just curious: do you have that link bookmarked?

duskwuff

Current versions of macOS use a signed system volume [1], much like iOS - under a standard system configuration, the user can't replace system executables or other files, even as root. Unlike iOS, the user can disable SSV, but I'm not certain that's sufficient for GPLv3 - and I can't imagine Apple feels comfortable with that ambiguity.

[1]: https://support.apple.com/guide/security/signed-system-volum...

ezfe

By the GNU website it would be sufficient. The website says:

> GPLv3 stops tivoization by requiring the distributor to provide you with whatever information or data is necessary to install modified software on the device

By my reading of this, there is not a requirement that the operating system is unlocked, but the device. Being able to install an alternate operating system should meet the requirement to "install modified software on the device."

> This may be as simple as a set of instructions, or it may include special data such as cryptographic keys or information about how to bypass an integrity check in the hardware.

As you've mentioned with disabling SSV, and as Asahi Linux has shown, Apple Silicon hardware can run 3rd party operating systems without any problems.

chongli

Sure, though there's little point in replacing executables such as rsync when you can install your own version (perhaps through a package manager and package repository / database such as Homebrew [1] or MacPorts [2]) and use the PATH environment variable to decide which version of the executable you'd like to use in which context.

[1] https://brew.sh

[2] https://www.macports.org

jillyboel

Ew, how hostile.

troyvit

> Current versions of macOS use a signed system volume

Sometimes I feel like I'm deluding myself with the small inconveniences I put myself through only using Linux, but finding out about stuff like this wipes that away.

pabs3

TiVo didn't do that, they broke their proprietary software when it ran on a modified version of the GPLed Linux kernel.

Also, GPLv2 requires the ability to modify and reinstall, just like GPLv3.

https://sfconservancy.org/blog/2021/mar/25/install-gplv2/ https://sfconservancy.org/blog/2021/jul/23/tivoization-and-t...

Neither GPLv2 nor GPLv3 prevent what TiVo actually did.

https://events19.linuxfoundation.org/wp-content/uploads/2017...

harry8

> So for iOS, it’s pretty obvious why they don’t use gplv3… because it would violate the terms.

Apple is using "openrsync" because they want to close the code more than the rsync license lets them.

mattl

I’m not sure they care about rsync’s code, they probably just don’t want to maintain an old fork of rsync under GPLv2.

jitl

> It also means they’re more free to lock down macOS from running your own code on it in the future, without worrying about having to rip out all the gpl3 code when it happens. Better to just not build on it in the first place.

how does locking down macOS have anything to do w/ GPL compliance? Apple is free to do whatever BS with the OS they ship in terms of terminal access, user permission level, etc regardless of GPL of any code on the device. I could ship a GPLv3 system tomorrow that disallows user root access and as long as I make the OS source freely available and redistributable, it's fine.

ninkendo

If you make a device which uses GPL’d code, and provide all the covered source code you used, but prevent users from putting any modified code on the device, you are in violation of GPLv3, but not GPLv2. That means this sentence:

> I could ship a GPLv3 system tomorrow that disallows user root access and as long as I make the OS source freely available and redistributable, it's fine.

Is not true for gpl3. It’s called the “tivo-ization” loophole, and it’s one of the principal reasons the GPL3 was made in the first place. I think you’re just wrong.

(Note: I’m not claiming Apple would be in violation for shipping e.g. a GPLv3 bash on macOS today, only that they would be in violation for doing that on iOS today, or, if in the future they locked down macOS in the same way that iOS is, then for macOS too.)


NewsaHackO

[flagged]

p0w3n3d

> they’re more free to lock down macOS from running your own code on it in the future, without worrying about having to rip out all the gpl3 code when it happens. Better to just not build on it in the first place.

That's actually quite scary what you wrote there.

That's also even more scary to me, as I am really watchful for such restrictions which can IMO happen in current OSes any time now ...

kijiki

This is really easy, just use Linux.

KerrAvon

No, this doesn't quite scan, because there's no reason they couldn't ship a current version of `bash` or any number of other GPL3 things. Aurornis is probably closest to the mark: it is legally ambiguous, and Apple probably does not want to be a test case for GPL3 compliance.

ninkendo

If they shipped a gpl3 version of bash on iOS, they would be in violation. This isn’t really a question: gpl3 requires you to not only provide the source if you use it in a product, but the ability to modify it and run your modified version. Which iOS doesn’t let you do.

Now, macOS would be fine shipping a gpl3 bash. But not iOS. (Yes, iOS has bash. Or at least it used to; they may be all on zsh now, I’m not sure.)

So, the question becomes to Apple, do we ship different bash versions for different devices, and treat macOS as being different, and have to worry about only using newer bash features on macOS? Or do we keep the same old version on all platforms, and just eschew the new bash everywhere? It’s a pretty simple decision IMO, especially because users can just use brew on macOS and put their own bash on there if they want.

Others are pointing out that gpl3 is less tested in court and that lawyers are just more uncertain/afraid of gpl3 than gpl2, especially with respect to patents… but I don’t think these are mutually exclusive. It’s clear that they can’t ship gpl3 on 4 out of their 5 operating systems. macOS is an outlier, and from an engineering standpoint it’s a lot simpler to just keep them all the same than it is to ship different scripts/etc for different platforms. It can be both reasons.

Someone

> For macOS they could certainly get away with shipping gplv3 code

Even limiting that to “in the USA” I would never say certainly for a license for which so little jurisprudence exists.

Once you add in multiple countries, it doesn’t get clearer.

And yes, that applies to GPLv2, too, but that ship has sailed. I also don’t see them add much new GPLv2 licensed software.

For GPLv3, they also may be concerned about patents. If, to support some MacOS feature, they change a GPLv3 licensed program that uses one of their patents, GPLv3 gives others the rights to use those patents in versions of the tool that run on other platforms.

Aurornis

My perspective on GPL and related licenses changed a lot after working with lawyers on the topic. Some of the things I thought to be completely safe were not as definitive to the lawyers.

I don’t know Apple’s reasoning, but I know that choosing non-GPL licenses when available was one of the guiding principles given to us by corporate lawyers at another company.

cosmic_cheese

A lot of it is indeed the legal murkiness.

On the engineering level, other licenses likely get selected because it’s easy. You don’t need to consult the legal department to know how to comply with licenses like MIT, BSD, etc., so you just pull the thing in, make any required attributions, and continue on with your day. It’s a lot less friction, which is extremely attractive.

KerrAvon

Yes, although even for the more liberal licenses you actually still want legal review at a sufficiently large company to ensure that your engineering read of the license is accurate. What if someone changed the wording slightly in some way that turns out to be legally significant, etc.

butchlugrod

I work at a large corporation, but one that only has 6% of Apple’s annual revenue. Even the emails we send to end users get a review from the legal team prior to us hitting send.

Yeah, there are some assumptions which can be made about licenses and their suitability for our purposes, but no serious organization is touching that code until there has been a full audit of those license terms and the origin of every commit to the repository.

pjmlp

The kind of places I usually work for, you do need to consult with legal regardless of the license.

And to prevent your scenario, CI/CD systems are usually gapped to internal repos; unless dependencies are validated and uploaded into those repos, the build is going to break.

giantrobot

This was basically the justification I was told when I was at Apple. The GPLv3 is too viral for the liking of Apple's legal department. They do not want to be the test case for the license.

quotemstr

The funny thing is that the rest of the world has moved on and is no longer afraid of the GPLv3. The reality that people aren't, as Apple's legal people predicted, being legally obliterated hasn't changed Apple legal's stance. Doomsday cults actually get stronger when doomsday fails to arrive.

palata

> but I know that choosing non-GPL licenses when available was one of the guiding principles

Sure, but in this case Apple has chosen, for 20 years, to not go with GPLv3 when there was no alternative.

sbuk

You could also say the same of the Linux kernel too. After all, they have chosen, for 20 years, to not go with GPLv3…

jillesvangurp

I've had similar training back in the day. This was when my employer (Nokia) was making Linux based phones and they needed to educate their engineers on what was and wasn't legally dodgy to stay out of trouble. Gplv2 was OK with permission (with appropriate measures to limit its effect). Particularly with Java, you had to be aware of the so-called classpath exception Sun added to make sure things like dynamic linking of jar files would not get you into trouble. Permissive licenses like Apache 2.0, MIT, and BSD were not considered a problem. GPLv3 was simply a hard no. You'd get no permission to use it, contribute to it, etc.

Apple, Nokia, and many other large companies employ lawyers that advise them to steer clear of things like GPLv3. The history of that particular license is that it tried to make a few things stricter relative to GPLv2, which had unintentionally allowed things like commercial Linux distributions mixing closed and open source. That's why Android exists and is Linux based, for example. That could not have happened without the loopholes in GPLv2. In a way that was a happy accident and definitely not what the authors of that license had in mind when they wrote the GPL.

It's this intention that is the problem. GPLv3 might fail to live up to its intentions in some respects because of untested (in court), ambiguous clauses, etc. like its predecessor. But the intention is clearly against the notion of mixing proprietary and OSS code. Which, like it or not, is what a lot of big companies do for a living. So, Apple is respecting licenses like this by keeping anything tainted by it at arms length and just not dealing with it.

pjmlp

As someone on the Networks side, I had the pleasure to write multiple Excel files with all the dependencies of our product listing all the relevant facts for every single jar file.

ants_everywhere

I'm curious if you remember any of the specifics.

At a big company I worked for, GPL licenses were strictly forbidden. But I got the vibe that was more about not wanting to wind up in a giant court case because of engineers not being careful in how they combined code.

I'd be super curious if there are explicit intentional acts that people generally think are okay under GPL but where lawyers feel the risk is too high.

squiggleblaz

Linking against GPL code on a backend server which is never distributed - neither in code nor binary form. (Because what might happen tomorrow? Maybe now you want to allow enterprise on-prem.)

ndiddy

> Some of the things I thought to be completely safe were not as definitive to the lawyers.

Can you elaborate?

ndegruchy

In all likelihood they just don't want to broach the idea of having to fight (and potentially lose) the GPL3 in court. Given the case history on the GPL2, it seems like more work than it's worth. They can just replace the parts that are "problematic" in their eyes and avoid a whole class of issues.

m463

"Some devices are designed to deny users access to install or run modified versions of the software inside them, although the manufacturer can do so. This is fundamentally incompatible with the aim of protecting users' freedom to change the software." -- https://www.gnu.org/licenses/gpl-3.0.en.html

Arnt

Apple doesn't say. IMO you should not trust other people's statements about Apple's reasoning.


pjmlp

Not only Apple, everyone.

I never worked at any company that allows for GPLv3 dependencies, and even GPLv2 dependencies aren't welcome unless validated by the legal team first.

jeroenhd

But Apple did ship GPLv2. They're shipping open source tools with infectious licenses like they always did.

This isn't like the normal "take someone else's work for free but don't give anything back" approach most companies follow when they decide to avoid GPL code.

toast0

They're respecting the terms of the license.

In particular, when a piece of software changes from GPLv2 to GPLv3, that is effectively asking Apple to stop updating it, and they do as asked.

secure

I looked at openrsync when I was writing my own https://github.com/gokrazy/rsync implementation (in Go!) and it’s good code :)

It’s a shame that openrsync is not 100% compatible with rsync — I noticed that Apple was starting to switch to openrsync because my own tests broke on macOS 15.

Symbiote

> openrsync is written as part of the rpki-client(1) project, an RPKI validator for OpenBSD. openrsync was funded by NetNod, IIS.SE, SUNET and 6connect.

Could anyone suggest why these organizations would want to fund this development?

https://github.com/kristapsdz/openrsync?tab=readme-ov-file#p...

jimsmart

This comment explains the reason for its existence quite well:

https://news.ycombinator.com/item?id=43605846

Companies fund things because they're useful or necessary. My guess is that some of the companies listed might use BSD — and perhaps wanted/needed an implementation of rsync that was not GPL3 licensed.

And/or they simply have an interest in funding Open Source projects / development.

Squossifrage

Three out of four aren't even companies. SUNET is the Swedish NREN, NetNod is a non-profit that manages Internet infrastructure services (like DNS and NTP) in Sweden, IIS is the non-profit that manages the Swedish TLDs.

jimsmart

Feel free to substitute my use of the word "company", with "company / organisation / foundation". Plus others I'm surely forgetting.

I meant 'company' in the sense of a legal entity, probably paying some kind of tax, probably having to register/file their accounts every year. Here in the UK, all of these various different types of 'companies' all have to register with Companies House, and file tax returns to HMRC. 'Company' is the overarching legal term here.

— But sure, my bad: the post I was replying to actually used a term that is arguably better, 'organisations'. And I should have used that.

But my point still stands, whether a private limited company, or a non-profit of some kind, or an organisation, or a foundation, or a charity, or whatever — they're all legal entities of some kind — and they're all able to fund anything they please, if they see value in it.

- NetNod is actually a private limited company according to Wikipedia [1]. Corporate identity number: 556534-0014.

- Swedish Internet Foundation, formerly IIS, have corporate identity number: 802405-0190 (on their website [2])

- Sunet is a department of the Swedish Research Council, and uses the Swedish Research Council’s corporate identity number 2021005208, according to their website [3]

So they are all registered with the Swedish Companies Registration Office. Which I assume is their equivalent of Companies House here in the UK.

Maybe if you still think that they're not 'companies' — of some kind — then perhaps take it up with the Swedish Companies Registration Office! ;)

[1] https://en.wikipedia.org/wiki/Netnod

[2] https://internetstiftelsen.se/en/

[3] https://www.sunet.se/en/contact

0x0

I recently ran into an issue with this: building an iOS .ipa from the command line with xcodebuild apparently ends up shelling out to rsync to copy some files between local directories. Because I had Homebrew's rsync earlier in $PATH, it would end up running that one, but xcodebuild passed an openrsync-only command line argument, "--extended-attributes", which Homebrew's rsync doesn't understand, so it would exit with a failure.
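
A possible workaround (an untested sketch; the scheme name is a placeholder) is to put /usr/bin back at the front of PATH for that one invocation, so xcodebuild resolves the system rsync instead of the Homebrew one:

    $ PATH="/usr/bin:$PATH" xcodebuild -scheme MyApp archive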

emmelaich

For a while (up to and including Sequoia 15.3), both rsync_samba and rsync_openrsync were available, via /var/select/rsync or the env variable CHOSEN_RSYNC.

One particular annoyance of openrsync is that it claimed to support the /./ magic path element for --relative. I sent a bug report to Apple for this about a month ago.

rsync_samba is gone as of Sequoia 15.4.

I've installed rsync from homebrew.

abotsis

I continue to be happy that Apple continues to enhance and embrace the POSIX side of macOS vs gradually stripping it away in some kind of attempt to make it more like iOS.

fmajid

Just like they replaced bash with zsh. Most Big Tech firms are allergic to GPL3.

7e

GPLv3 is a legal landmine. In fact, GPL itself is wildly unpopular compared to more open licenses. The FSF is getting what it deserves here. Open source predates the FSF and will remain long after the FSF is dead.

mcstafford

Whose popularity do you champion, and what sort of motive brings deservedness into the discussion?

wanderingmind

Can you show examples of impactful open software that predate the FSF and Stallman?

donnachangstein

BSD predates the Stallman Utilities (kernel sold separately) by about a decade.*

* in "shared source" form

emmelaich

Sharing (typically via tape) of software utilities used to be very common in every user group from the start (1960s). It was just the culture, and expected. Especially among IBM mainframe users and DEC VMS users.

Of course the answer to your question depends on the definition of 'open source' and 'impactful'.

anthk

Thanks to the FSF we have cheap Unix clones with easy installs. Even Android should thank the FSF for its existence.

handsclean

“Pesticide wildly unpopular with pests.”

gtsop

Which is why I dislike non gplv3 open source software so much. It allows the pests to live on.

man4

[dead]

tiffanyh

Looks like OpenBSD maintains openrsync.

https://github.com/kristapsdz/openrsync

emchammer

Apple could do worse than importing tools from the OpenBSD Project. Now there are several more commands that would be helpful...