7-Zip for Windows can now use more than 64 CPU threads for compression

d33

I worry that 7-Zip is going to lose relevance because of its lack of zstd support. zlib's performance is intolerable for large files, and zlib-ng's SIMD implementation only helps a bit here. Which is a shame, because 7-Zip is a pretty amazing container format, especially with its encryption and file splitting capabilities.

dikei

I use ZSTD a ton in my programming work where efficiency matters.

But for sharing files with other people, ZIP is still king. Even 7z or RAR is niche. Everyone can open a ZIP file, and they don't really care if the file is a few MBs bigger.

cesarb

> Everyone can open a ZIP file, and they don't really care if the file is a few MBs bigger.

You can use ZSTD with ZIP files too! It's compression method 93 (see https://pkware.cachefly.net/webdocs/casestudies/APPNOTE.TXT which is the official ZIP file specification).

Which reveals that "everyone can open a ZIP file" is a lie. Sure, everyone can open a ZIP file, as long as that file uses only a limited subset of the ZIP format features. Which is why formats which use ZIP as a base (Java JAR files, OpenDocument files, new Office files) standardize such a subset; but for general-purpose ZIP files, there's no such standard.

(I have encountered such ZIP files in the wild; "unzip" can't decompress them, though p7zip worked for these particular ZIP files.)
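
To check which compression method each entry in a given archive uses, a minimal Python sketch along these lines could help. It relies on the standard `zipfile` module, which as far as I know reads the central directory even when it cannot decompress an entry. The `METHOD_NAMES` table and the `list_methods` helper are illustrative names, and the table covers only a handful of the method IDs listed in APPNOTE.TXT.

```python
import io
import zipfile

# A few compression method IDs from APPNOTE.TXT (not the full list).
METHOD_NAMES = {
    0: "stored",
    8: "deflate",
    9: "deflate64",
    12: "bzip2",
    14: "lzma",
    93: "zstd",
}


def list_methods(zip_source):
    """Return {entry name: compression method name} for a ZIP archive.

    zip_source may be a path or a seekable binary file object.
    Unknown method IDs are reported as "method <id>".
    """
    with zipfile.ZipFile(zip_source) as zf:
        return {
            info.filename: METHOD_NAMES.get(
                info.compress_type, f"method {info.compress_type}"
            )
            for info in zf.infolist()
        }


# Small demo: build an archive in memory and inspect it.
demo = io.BytesIO()
with zipfile.ZipFile(demo, "w") as zf:
    zf.writestr("plain.txt", b"hello", compress_type=zipfile.ZIP_STORED)
    zf.writestr("packed.txt", b"hello" * 100, compress_type=zipfile.ZIP_DEFLATED)
demo.seek(0)
print(list_methods(demo))
```

An entry reported as "zstd" (method 93) is the kind that older extractors are likely to reject.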

throw0101d

> You can use ZSTD with ZIP files too!

Support for which was added in 2020:

> On 15 June 2020, Zstandard was implemented in version 6.3.8 of the zip file format with codec number 93, deprecating the previous codec number of 20 as it was implemented in version 6.3.7, released on 1 June.[36][37]

* https://en.wikipedia.org/wiki/Zstd#Usage

So I'm not sure how widely deployed it would be.

dikei

Well, only a lunatic would use ZIP with anything but DEFLATE/DEFLATE64

easton

> new Office files

I know what you mean, I’m not being pedantic, but I just realized it’s been 19 years. I wonder when we’ll start calling them “Office files”.

sidewndr46

Same thing with "WAV" files. There's at least 3 popular formats for the audio data out there.

guappa

You can and I've done it… but you can't expect anything to be able to decompress it unless you wrote it yourself.

justin66

> Copyright (c) 1989 - 2014, 2018, 2019, 2020, 2022

Mostly it seems nutty that, after all these years, they’re still updating the zip spec instead of moving on to a newer format.

mrWiz

My main use case for 7z is bypassing corporate filters that block ZIPs from being sent.

starik36

I think gmail is onto you. They blocked one of my 7z files the other day.

notepad0x90

I don't know about that. I had a dicey situation recently where PowerShell's Compress-Archive couldn't handle archives >4GB and I had to use 7-Zip. It is more reliable, and you can ship 7za.exe or create self-extracting archives (wish those were more of a thing outside of the Windows world).

chasil

In the realm of POSIX.2 and UNIX relatives, the closest analog would be a "shar" archive.

They are not regarded kindly.

https://en.wikipedia.org/wiki/Shar_(file_format)

jart

Use the pigz command for parallel gzip. Mark Adler also has an example floating around somewhere about how to implement basically the same thing using Z_BLOCK.
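
For a rough idea of what block-parallel gzip looks like, here is a minimal Python sketch. It is not pigz's method and not the Z_BLOCK approach mentioned above. It swaps in a simpler technique: each chunk is compressed as its own gzip member and the members are concatenated, which the gzip format (RFC 1952) allows. The function name `parallel_gzip` and its parameters are made up for illustration.

```python
import gzip
from concurrent.futures import ThreadPoolExecutor


def parallel_gzip(data: bytes, chunk_size: int = 1 << 20, workers: int = 4) -> bytes:
    """Compress data as a series of independent gzip members.

    Each chunk is compressed on a worker thread (zlib releases the GIL
    while compressing), then the members are joined in order.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    # Always emit at least one member so empty input still round-trips.
    chunks = [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)] or [b""]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        members = list(pool.map(gzip.compress, chunks))  # map keeps input order
    return b"".join(members)


payload = b"7-Zip, pigz and friends\n" * 200_000
packed = parallel_gzip(payload, chunk_size=256 * 1024)
assert gzip.decompress(packed) == payload  # multi-member streams decode as one
```

Because every member starts with an empty history window, the output is somewhat larger than a single-stream gzip of the same data.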

sidewndr46

What are you compressing with zstd? I had to do this recently and the "xz" utility still blows it away in terms of compression ratio. In terms of memory and CPU usage, zstd wins by a large margin. But in my case I only really cared about compression ratio

vlovich123

people tend to care about decompression speed - xz can be quite slow decompressing super compressed files whereas zstd decompression speed is largely independent of that.

People also tend to care about how much time they spend on compression for each incremental % of compression performance and zstd tends to be a Pareto frontier for that (at least for open source algorithms)

Szpadel

in my experience using zstd --long --ultra -22 gives marginally better compression ratio than xz -9 while being significantly faster

xxs

do you have examples where xz 'blows it away', not just zstd -3?

Night_Thastus

7-zip is the de-facto tool on Windows and has been for a long time. It's more than fast and compressed enough for 99% of people's use cases.

It's not going anywhere anytime soon.

The more likely thing to eat into its relevance is that Windows now has built-in basic support for zipping/unzipping (EDIT: other formats*), which relegates 7-zip to more niche uses.

Bender

> 7-zip is the de-facto tool on Windows and has been for a long time.

Agreed. The only thing I think it has been missing is PAR support. I think they should consider incorporating one of the par2cmdline forks and porting that code to Windows as well so that it has recovery options similar to WinRAR. It's not used by everyone but that should deprecate any use cases for WinRAR in my opinion.

malfist

Windows has had built-in zip/unzip since Vista. 7zip is far superior (and the install base proves that).

Night_Thastus

As mentioned in another comment, zip support actually goes further back as far as '98, but only Windows 11 added support for handling other formats like RAR/7-Zip/.tar/.tar.gz/.tar.bz2/etc.

That allows it to be a default that 'just works' for most people without installing anything extra.

The vast majority of users don't care about the extra performance or functionality of a tool like 7-zip. They just need a way to open and send files and the Windows built-in tool is 'good enough' for them.

I agree that 7-zip is better, but most users simply do not care.

iamleppert

Windows unzip is so ungodly slow and terrible! Long live 7zip!

izzydata

Is there something different about the built in zip context menu functionality now than before? I'm pretty sure you could convert something to a zip file since forever ago by right clicking any file.

Night_Thastus

It could support basic ZIP files, but only Windows 11 added support for 7-Zip (.7z), RAR (.rar), TAR, and TAR variants (like .tar.gz, .tar.bz2, etc).

That makes it 'good enough' for the vast majority of people, even if it's not as fast or fully-featured as 7-Zip.

Beretta_Vexee

You are looking for 7-Zip Zstd: https://github.com/mcmilk/7-Zip-Zstd

I don't know what your use case is, but it seems to be quite a niche.

zx2c4

I was curious upon seeing this and found the thread where its inclusion was turned down: https://sourceforge.net/p/sevenzip/discussion/45797/thread/a...

rf15

Not that many people care about zstd; I would assume most 7-zip users care about the convenience of the gui.

snickerdoodle12

That's why 7zip should support it. People care about the convenience of the GUI and we all benefit from better compression being accessible with a nice GUI.

arp242

It's been a long time since I used Windows, but back in the day I used 7-Zip exactly because it could open more or less $anything. That's also why we installed it on many customer computers.

On Linux bsdtar/libarchive gives a similar experience: "tar xf file" works on most things.

devilbunny

7-Zip is like VLC: maybe not the best, but it’s free (speech and beer) and handles almost anything you throw at it. For personal use, I don’t care much about efficient compression either computationally or in terms of storage; I just want “tar, but won’t make a 700 MB blank ISO9660 image take 700 MB”.

KronisLV

That's basically me! I really like 7-Zip because it opens most archive formats I have to work with and also the .7z format has pretty good compression for the stuff I want to store longer term.

Beretta_Vexee

I just hope that the recipient will be able to open the file without too much difficulty. I am willing to sacrifice a few megabytes if necessary.

jorvi

.. but 7-zip has a pretty terrible GUI?

Hence why PeaZip is so popular, and J-Zip used to be before it was stuffed with adware.

sidewndr46

If you're expecting a "mobile first" or similar GUI where most of the screen is dedicated to whitespace, basic features involve 7 or more mouse clicks, and for some reason it all gets changed every ~6 months, then yes, the 7zip GUI is terrible.

Desktop software usability peaked sometime in the late 90s, early 2000s. There's a reason why 7zip still looks like ~2004

general1726

Most people won't use that GUI, but will right click file or folder -> 7-Zip -> Add To ... and it will spit out a file without questions.

Granted Windows 11 has started doing the same for its zip and 7zip compressors.

Same trick goes for opening archives or executables (Installers) as archives.

m-schuetz

All the GUI I need is right click-> extract here or to folder. And 7zip is doing that nicely.

Jackson__

PeaZip is popular? It seems a lot less tested than 7zip; Last time I tried to use it, it failed to unpack an archive because the password had a quote character or something like that. Never had such crazy issues in 7zip myself.

Gormo

> .. but 7-zip has a pretty terrible GUI?

Since you're asking, the answer is no. 7-Zip has an efficient and elegant UI.

delfinom

I would never trust PeaZip.

The author updates code in the github repo....by drag and drop file uploads. https://github.com/peazip/PeaZip/commits/sources/

yapyap

if by gui u mean the ability to right click a .zip file and unzip it just through the little window that pops up ur totally right. At least that + the unzipping progress bar is what I appreciate 7zip for

quickthrowman

I use the right click context menu to run 7zip, why would you open a GUI?

quietbritishjim

That is a GUI!

sammy2255

abhinavk

https://github.com/M2Team/NanaZip

It includes the above patches as well as few QoL features.

d33

Thanks! Any ideas why it didn't get merged? Clearly 7-Zip has some development activity going on and so does this fork...

Beretta_Vexee

Working with Igor Pavlov, the creator of 7-zip, does not seem very straightforward (understatement).

Tuldok

7-zip's development is very cathedral. Igor Pavlov doesn't look like he accepts contributions from the public.

jccalhoun

Since Windows 11 incorporated libarchive back in October 2023 there is less reason to use 7-zip on windows. I would be surprised if any of my friends even know what a zip file is let alone zstd.

rs186

If you ever try to extract an archive file of several gigabyte size with hundreds of thousands of files (I know, it's rare), the built-in one is as slow as a turtle compared to 7z.

m-schuetz

Being a bit faster or efficient won't make most people switch. 7z offers great UX (convenient GUI and support for many formats) that keeps people around.

rat9988

If anything, the gui and ux is terrible compared to winrar.

marcellus23

Why are they not adopting zstd?

ninjis

I had initially migrated to NanaZip, but with Windows natively supporting the 7z format now, I'm not sure it's needed anymore.

avidiax

Why was there a limitation on Windows? I can't find any such limit for Linux.

monocasa

A lot of synchronization primitives in the NT kernel are based on a register width bit mask of a CPU set, so each collection of 64 hardware threads on 64 bit systems kind of runs in its own instance of the scheduler. It's also unfortunately part of the driver ABI since these ops were implemented as macros and inline functions.

Because of that, transitioning a software thread to another processor group is a manual process that has to be managed by user space.
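
To make the bitmask limit concrete, here is a small Python sketch of the arithmetic only. It assumes groups are filled 64 logical processors at a time, which is a simplification (real Windows group layouts can differ), and the names `GROUP_WIDTH`, `locate`, and `affinity_mask` are invented for illustration.

```python
GROUP_WIDTH = 64  # one bit per logical processor in a 64-bit mask


def locate(cpu_index: int) -> tuple[int, int]:
    """Map a flat logical-CPU index to (group number, bit within group).

    Assumes every group is fully populated, which is a simplification.
    """
    if cpu_index < 0:
        raise ValueError("cpu_index must be non-negative")
    return divmod(cpu_index, GROUP_WIDTH)


def affinity_mask(bits) -> int:
    """Build a single-group affinity mask from bit positions 0..63."""
    mask = 0
    for bit in bits:
        if not 0 <= bit < GROUP_WIDTH:
            # A 65th processor has no bit to live in: it needs another group.
            raise ValueError(f"bit {bit} does not fit in a {GROUP_WIDTH}-bit mask")
        mask |= 1 << bit
    return mask


print(locate(70))                     # CPU 70 sits in the second group
print(hex(affinity_mask([0, 1, 63])))
```

A thread's affinity is therefore a pair (group, mask) rather than one flat mask, which is why moving a thread to another group is an explicit step.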

zik

Wow. That's surprisingly lame.

Const-me

The NT kernel dates back to 1993. Computers didn’t exceed 64 logical processors per system until around 2014. And doing it back then required a ridiculously expensive server with 8 Intel CPUs.

The technical decision Microsoft made initially worked well for over two decades. I don’t think it was lame; I believe it was a solid choice back then.

lmm

Seems like this is a general Windows thing per https://learn.microsoft.com/en-us/windows/win32/procthread/p... - applications that want to run on more than 64 CPUs need to be written with dedicated support for doing so.

dwattttt

The linked Processor Group documentation also says:

> Applications that do not call any functions that use processor affinity masks or processor numbers will operate correctly on all systems, regardless of the number of processors.

I suspect the limitation 7zip encountered was in how it checked how many logical processors a system has, to determine how many threads to spawn. GetActiveProcessorCount can tell you how many logical processors are on the system if you pass ALL_PROCESSOR_GROUPS, but that API was only added in Windows 7 (that said, that was more than 15 years ago, they probably could've found a moment to add and test a conditional call to it).
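
A conditional call of that kind might look like the following Python sketch using `ctypes`. `GetActiveProcessorCount` and the `ALL_PROCESSOR_GROUPS` value (0xFFFF) are the real Win32 names; the wrapper `logical_processor_count` and its fallback behaviour are my own assumptions, not anything taken from 7-Zip.

```python
import ctypes
import os
import sys

ALL_PROCESSOR_GROUPS = 0xFFFF  # Win32 constant: count across every group


def logical_processor_count() -> int:
    """Best-effort count of logical processors on the whole machine.

    On Windows, ask kernel32 for the total across all processor groups.
    Elsewhere, or if the call fails, fall back to os.cpu_count().
    """
    if sys.platform == "win32":
        try:
            kernel32 = ctypes.WinDLL("kernel32", use_last_error=True)
            func = kernel32.GetActiveProcessorCount  # absent before Windows 7
            func.argtypes = [ctypes.c_ushort]        # WORD GroupNumber
            func.restype = ctypes.c_uint32           # DWORD
            count = func(ALL_PROCESSOR_GROUPS)
            if count:
                return count
        except (OSError, AttributeError):
            pass  # fall through to the portable answer
    return os.cpu_count() or 1


print(logical_processor_count())
```

Knowing the total is only half the job: threads still have to be placed in the other groups before they can run there.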

dspillett

It isn't just detecting the extra logical processors, you have to do work to utilise them. From the linked text:

"If there are more than one processor group in Windows (on systems with more than 64 cpu threads), 7-Zip distributes running CPU threads across different processor groups."

The OS does not do that for you under Windows. Other OSs handle that many cores differently.
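
As a sketch of what "distributing threads across processor groups" could mean, the Python below assigns each worker to the group that is currently least loaded relative to its size. This is only an illustration of the idea, not 7-Zip's actual algorithm, and `distribute_threads` is a hypothetical helper. On Windows the resulting group number would then be applied with something like `SetThreadGroupAffinity`.

```python
def distribute_threads(group_sizes, n_threads):
    """Return a list giving the processor group chosen for each thread.

    Greedy rule: put the next thread in the group whose load ratio
    (threads assigned + 1) / (logical processors in group) is lowest.
    """
    if not group_sizes or any(size <= 0 for size in group_sizes):
        raise ValueError("group_sizes must be non-empty positive integers")
    assigned = [0] * len(group_sizes)
    placement = []
    for _ in range(n_threads):
        group = min(
            range(len(group_sizes)),
            key=lambda g: (assigned[g] + 1) / group_sizes[g],
        )
        assigned[group] += 1
        placement.append(group)
    return placement


# Example: a machine with two groups of 64 and 32 logical processors.
plan = distribute_threads([64, 32], 96)
print(plan.count(0), plan.count(1))
```

With one thread per logical processor, each group ends up with exactly as many threads as it has processors.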

> more than 15 years ago, they probably could've found a moment to add and test a conditional call to it

I suspect it hasn't been an issue much at all until recently. Any single block of data worth spinning up that many threads for compressing is going to be very large, you don't want to split something into too small chunks for compression or you lose some benefit of the dynamic compression dictionary (sharing that between threads would add a lot of inter-thread coordination work, killing any performance gain even if the threads are running local enough on the CPU to share cache). Compression is not an inherently parallelizable task, at least not “embarrassingly” so like some processes.
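
The cost of splitting can be seen in a small experiment. The Python sketch below compresses the same synthetic data once as a whole and once as independent chunks with `zlib`, so each chunk starts with an empty dictionary. The data generator, the vocabulary, and the names `make_sample` and `whole_vs_chunked` are made up for illustration, and zlib stands in for whatever codec is actually in use.

```python
import random
import zlib

VOCAB = [
    "archive", "block", "chunk", "codec", "dictionary", "entropy",
    "frame", "header", "literal", "match", "offset", "stream",
    "thread", "window", "zip", "zstd",
]


def make_sample(n_words: int = 40_000, seed: int = 0) -> bytes:
    """Deterministic pseudo-text built from a small vocabulary."""
    rng = random.Random(seed)
    return " ".join(rng.choice(VOCAB) for _ in range(n_words)).encode()


def whole_vs_chunked(data: bytes, n_chunks: int) -> tuple[int, int]:
    """Return (size compressed whole, total size compressed per chunk)."""
    whole = len(zlib.compress(data, 9))
    size = max(1, -(-len(data) // n_chunks))  # ceiling division
    chunked = sum(
        len(zlib.compress(data[i:i + size], 9))
        for i in range(0, len(data), size)
    )
    return whole, chunked


whole, chunked = whole_vs_chunked(make_sample(), 64)
print(f"whole: {whole} bytes, 64 independent chunks: {chunked} bytes")
```

The gap shrinks as chunks get larger, which is why the technique only pays off on big inputs.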

Even when you do have something to compress that would, in theory, benefit from more than 64 separate tasks, unless it is all in RAM (or on an incredibly quick & low latency drive/array) the process is likely to be IO starved long before it is compute starved, when you have that much compute resource to hand.

Recent improvements in storage options & CPUs (and the bandwidth between them) have presumably pushed the occurrences of this being worthwhile (outside of artificial tests) from “practically zero” to “near zero, but it happens”, hence the change has been made.

Note that two or more 7-zip instances working on different data could always use more than 64 threads between them, if enough cores to make that useful were available.

dwattttt

Are you sure that if you don't attempt to set any affinities, Windows won't schedule 64+ threads over other processor groups? I don't have any system handy that'll produce more than 64 logical processors to test this, but I'd be surprised if Windows' scheduler won't distribute a process's threads over other processor groups if you exceed the number of cores in the group it launches into.

The referenced text suggests applications will "work", but that isn't really explicit.

Dylan16807

That depends on what format you're using. Zip compresses every file separately. Bzip and zstd have pretty small maximum block sizes and gzip doesn't gain much from large blocks anyway. And even when you're making large blocks, you can dump a lot of parallelism into searching for repeat data.

lofties

Windows has a concept of processor groups, that can have up to 64 (hardware) threads. I assume they updated 7zip to support multiple processor groups.

xxs

WaitForMultipleObjects is limited to 64... since forever.

silon42

Maybe WaitForMultipleObjects limit of 64 (MAXIMUM_WAIT_OBJECTS) applies?

An ugly limitation on an API that initially looks superior to Linux equivalents.

whalesalad

Windows is a terrible operating system.

izzydata

This may or may not be a relevant question, but does the terminology of "zip" have the same origin as the zip disk drive?

malfist

No. Zip format significantly predates the zip disk.

leecarraher

I've used pbzip2 which takes the same parallel blocked compression approach 7zip seems to be taking (using AI's analysis of the changes). Theoretically the compression is less efficient, but i haven't noticed a difference in practice.

aquir

7-zip is one of the pieces of software that I miss since I’ve moved to macOS

portaltonowhere

Keka is also really nice!

https://www.keka.io/

aquir

Never heard of it, I'll give it a try!

MYEUHD

If you're talking about the program you use in the terminal, you can install it via homebrew

immibis

No, the GUI. 7-zip integrates well with the shell: select a group of files, right click -> make zip file, and so on. Or right-click a zip file and select extract. If you're accustomed to Linux you might not know what they're talking about.

TortoiseGit (and TortoiseSVN) are similarly convenient. Right click a folder with an SVN repo checked out, and select "SVN update". Right-click an empty space, and select "SVN checkout". SVN was the main distribution method for some modding communities before things like Steam Workshop and Github, specifically because TortoiseSVN made it so convenient. Checkout into your addons folder, and periodically update. What could be simpler?

DeepSeaTortoise

How about PeaZip?

aquir

I've used PeaZip in the past but only on Windows, I was not aware that a MacOS version exists! I'll give it a try. Cheers

lihaciudaniel

7zip has been the greatest usage for limbo x86 on mobile.

In Termux, you just use qemu-utils to convert your qcow2 partitions to IMG, and 7zip can read the IMG file.

Try it yourself and you'll see you can extract files from your emulated Windows.

ltbarcly3

Wow, a program that doesn't matter anymore has been very very minimally enhanced on a platform that doesn't matter anymore, benefitting the 7 users that have more than 64 real cores with Windows and are regularly compressing archives so large that it doesn't drastically reduce the compression ratio to split it into more than 64 sections.

Posting this link to hn has consumed more human potential than the thing it is describing will save up to the end of time.

tobinc

A 1% speed improvement for 1% of 7zip users is several times more productive than your comment.