Show HN: Downloading a folder from a repo using rust

tsylba

You guys using convoluted git commands when a single line of subversion works:

svn checkout https://github.com/tensorflow/tensorflow/trunk/tensorflow/ex...

zenmac

Link is broken

pornel

There's hardly any Rust in there. It's shelling out to the git command. This could have been a couple lines of bash.

Actually doing this in Rust with lower-level libraries like gix would have been interesting.

panki27

   $ git clone --no-checkout $URL/repo.git
   $ cd repo/
   $ git sparse-checkout init
   $ git sparse-checkout set subdirectory_i_want
   $ git checkout main

cakoose

Now I'm curious -- is there a way to do this that avoids downloading any more than strictly necessary?

The command above downloads the whole repo history. You could do a depth=1 to skip the history, but it still downloads the latest version of the entire repo tree.

craftkiller

You could do a blobless or treeless clone: https://github.blog/open-source/git/get-up-to-speed-with-par...

Combine that with --depth=1 and the --no-checkout / sparse-checkout flow that the GP already described.

I just tested on the emacs repo, left column is disk usage of just the `.git` folder inside:

  Shallow clones (depth=1):
  124K: Treeless clone depth=1 with no-checkout
  308K: Blobless clone depth=1 with no-checkout
  12M: Treeless clone depth=1 sparse checkout of "doc" folder
  12M: Blobless clone depth=1 sparse checkout of "doc" folder
  53M: Treeless clone depth=1 non-sparse full checkout
  53M: Blobless clone depth=1 non-sparse full checkout
  53M: Regular clone with depth=1

  Non-shallow clones:
  54M: Treeless clone with no-checkout
  124M: Blobless clone with no-checkout
  65M: Treeless clone sparse checkout of "doc" folder
  135M: Blobless clone sparse checkout of "doc" folder
  107M: Treeless clone with non-sparse full checkout
  177M: Blobless clone with non-sparse full checkout
  653M: Full regular git clone with no flags

Great tech talk covering some of the newer lesser-known git features: https://www.youtube.com/watch?v=aolI_Rz0ZqY
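
For anyone who wants to try the combination described above (partial-clone filter, depth 1, no checkout, then sparse checkout), a rough sketch might look like the following. The function name `sparse_fetch`, the `DRY_RUN` switch, and the default branch `main` are assumptions made for illustration; they do not come from the thread or from git itself. It has not been measured against the numbers listed above.

```shell
#!/bin/sh
# Print the command when DRY_RUN is set, otherwise execute it.
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# sparse_fetch URL DIR SUBDIR [BRANCH] [FILTER]   (hypothetical helper)
# FILTER is "blob:none" for a blobless clone or "tree:0" for a treeless one.
sparse_fetch() {
    url=$1 dir=$2 subdir=$3 branch=${4:-main} filter=${5:-blob:none}
    # Shallow partial clone without populating the working tree.
    run git clone --depth=1 --filter="$filter" --no-checkout --branch "$branch" "$url" "$dir" &&
    # Restrict the working tree to the one subdirectory.
    run git -C "$dir" sparse-checkout init --cone &&
    run git -C "$dir" sparse-checkout set "$subdir" &&
    # Populate the working tree; missing objects are fetched on demand.
    run git -C "$dir" checkout "$branch"
}
```

Setting `DRY_RUN=1` before calling `sparse_fetch <url> <dir> <subdir>` only prints the four git commands, which is a cheap way to check them before running against a large repository. The remote has to support partial-clone filters for the `--filter` flag to have any effect.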

vient

git-archive downloads only the strictly necessary files, but it is not universally supported.

https://git-scm.com/docs/git-archive

archargelod

The rust app just calls a few git commands too[1]

Could've been a shell script[2]

[1] https://github.com/zikani03/git-down/blob/cb2763020edc81e464...

[2] https://textbin.net/ja17q8vga4

vient

Same can be done using git and tar

    mkdir -p <out_dir> && git archive --remote=<remote> --format=tar.gz <branch> <files...> | tar -xzC <out_dir>

Strangely, GitHub does not support this, so I tested it with Bitbucket.

luismedel

So,

  $ git-down -d bootstrap-dist https://github.com/twbs/bootstrap.git:master dist

is *way* better than

  $ git clone --depth 1 https://github.com/twbs/bootstrap.git
  $ cd bootstrap
  $ mv ./dist ~/stuff/bootstrap-latest

because: "C'mon, you don't have the time to be doing all that."

And then we wonder how the fuck we end up with malware on our systems.

panki27

All that's missing is `curl ... | sudo bash` as the install instruction in the README.

pharrington

Since this is a "Show HN," did you mean to post using your https://news.ycombinator.com/user?id=zikani_03 account?

koakuma-chan

I have a similar problem with Hugging Face. I run git clone and it doesn't download the models. I know it's supposed to use LFS, but I don't know how to make it work; I tried everything. I had to install their disgusting Python CLI to download a model.

0fflineuser

Don't you just need to install git-lfs (https://git-lfs.com/) and then run `git lfs pull`?

koakuma-chan

Oh I guess I didn't run `git lfs pull` lol thanks
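
Put together, the LFS steps mentioned above could be sketched as below. This assumes git-lfs is already installed; the function name `lfs_clone` and the `DRY_RUN` switch are made up for illustration. With the LFS filters set up by `git lfs install`, the clone itself may already fetch the large files, in which case the final `git lfs pull` just catches anything still missing.

```shell
#!/bin/sh
# Print the command when DRY_RUN is set, otherwise execute it.
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# lfs_clone URL DIR   (hypothetical helper)
lfs_clone() {
    url=$1 dir=$2
    # Register the LFS filters and hooks for the current user.
    run git lfs install &&
    run git clone "$url" "$dir" &&
    # Download any LFS objects still represented by pointer files.
    run git -C "$dir" lfs pull
}
```

Calling `lfs_clone <url> <dir>` with `DRY_RUN=1` set prints the three commands without touching the network.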