Httptap: View HTTP/HTTPS requests made by any Linux program

yoavm

The "How it was made" section of the README was no less interesting than the tool itself:

> The way we have set things up is that we live and practice together on a bit over a hundred acres of land. In the mornings and evenings we chant and meditate together, and for about one week out of every month we run and participate in a meditation retreat. The rest of the time we work together on everything from caring for the land, maintaining the buildings, cooking, cleaning, planning, fundraising, and for the past few years developing software together.

abraae

Reminds me of a quote from "Soul of a new machine":

> During one period, when the microcode and logic were glitching at the nanosecond level, one of the overworked engineers departed the company, leaving behind a note on his terminal as his letter of resignation: "I am going to a commune in Vermont and will deal with no unit of time shorter than a season."

why_at

Great quote, although the nitpicky part of my brain immediately thought "They must have days though?"

aitchnyu

In The Inner Citadel, in the section on living in the present, the author says there is a "thin" moment separating past and future, and a "thick" moment defined by meaningfulness. If a thin/technical moment is 1/44.1kHz, a thick moment is a note of music. A current answer to the meaning of life. This person is not about the day-to-day tensions.

sitkack

The day washes over you, but this person only needs to "deal with" the harvest.

alexflint

Wow that's an incredible quote! It feels like that to me too.

erdii

To be honest: This sounds like just another of the many many other yoga/spiritual cults that currently exist all over the western world.

EDIT: typos and slight wording changes

yoavm

I believe I grew up in a cult myself, and one of the things I've concluded from that experience, and from leaving it, is that everywhere is a cult. Humans have a tendency towards cult-ish life, and if the cult is big enough we just refer to it as "society". People were as afraid (more or less) to leave the cult I was at, as people are around me now when they consider doing anything that is out of the norm.

By no means am I trying to hint towards some conspiracy, or to say that all cults are equally bad (or good); just to say that sometimes the word cult simply means "a less popular way of life than the one most people around me live by".

bityard

A "cult" is a rather specific kind of organization. The typical hallmarks are non-mainstream spiritual beliefs, highly controlling and exploitative leadership, and rules against interacting with outsiders. Non-conformity generally results in outsized (sometimes violent) punishment and shame.

Under this definition, for example, Catholic nuns are decidedly not a cult. They know what they are in for when they join, and may leave the convent any time they wish. Most Amish communities are _probably_ not cults. I am undecided about Mormons but leaning towards maybe.

I don't know what kind of cult you grew up in (and you have my empathy if it was painful) but "society" by definition cannot be a cult.

nine_k

Isn't it funny how the very word "culture" is sort of related to "cult".

mhss

My understanding is that the definition of cult requires a common object of devotion. What's that object of devotion for "society"? it's too large and diverse of a group to categorize it as such IMHO. I agree however that sometimes people will categorize anything strongly deviating from the norm as cult-ish.

quesera

There is absolutely nothing in their README to suggest that you are using the word "cult" properly.

MisterTea

Did you visit their website? https://www.monasticacademy.org/

While I cannot judge them outright, their article "Cyborgs Need a Trustworthy Religion" can appear cultist as they try to intertwine technology and religion.

jonahx

Their video has a cultish vibe. Not necessarily of the dangerous variety, but there seemed to be a lot of shared jargon and groupthink under the umbrella of "freeing your mind."

maybehewasright

https://www.youtube.com/watch?v=5It1zarINv0&pp=ygUOa2diIGFnZ... Former KGB Agent Yuri Bezmenov Explains How to Brainwash a Nation (Full Length)

antics9

It’s a Buddhist monastery.

xg15

> For the past few years we have been recording a lecture series called Buddhism for AI. It's about our efforts to design a religion (yes, a religion) based on Buddhism for consumption directly by AI systems. We actually feel this is very important work given the world situation.

I think it's an indicator of just how weird the times we're currently living in really are, that this part actually makes perfect sense...

(whether or not it's a good idea or will lead to the results they envision is another question)

RajT88

There was a cool Korean movie which featured a robot Buddha, "Doomsday Book".

taurknaut

You'd think that the people willing to talk to a chatbot would not be willing to discuss the self with any honesty, but I'm continually surprised by the world.

RajT88

I have a friend who has mental health issues thanks to what life has thrown at her.

ChatGPT gives out surprisingly solid advice and feedback. It is a bad look that ChatGPT is more emotionally intelligent than her friends.

2030ai

I sadly assumed the first countryside photo was generated but I assume now it is real!

The mix of tech and meditation would appeal to me. Maybe the idea does (actually doing it is probably hard!).

It seems like a "Buddhist Recurse"

alexflint

Yeah that photo is real! That's where I live!

Yes, it's true, actually doing it is hard, but to be honest not as hard as a lot of other stuff (getting a phd for example, or goodness gracious buying a house in San Francisco). I love getting up early. I love living out in nature. I love chanting and eating meals together and making a version of Buddhism for AI systems!

If you're interested in what it's like, we have written a bunch of very short few-paragraph stories about our time at MAPLE here: https://tales.monasticacademy.org/

Silasdev

This seems like the kind of thing you can do before you have kids and real responsibilities. Then you need to get back to reality. Sounds fun though and I would have liked to experience it.

alexflint

httptap is a process-scoped http tracer that you can run without root privileges. You can run `httptap <command>` where <command> is a linux program and you get a trace of http/https requests and responses in standard output:

    httptap -- python -c "import requests; requests.get('https://monasticacademy.org')"
    ---> GET https://monasticacademy.org/
    <--- 308 https://monasticacademy.org/ (15 bytes)
    ---> GET https://www.monasticacademy.org/
    <--- 200 https://www.monasticacademy.org/ (5796 bytes)

It works by running <command> in an isolated network namespace. It has its own TCP/IP stack (for which it uses gVisor). It is not an HTTP proxy and so does not rely on <command> being configured to use an HTTP proxy. It decrypts TLS traffic by generating a CA on the fly. It won't install any iptables rules or make other global system changes.

maxmcd

Do you know if it's possible to get this working on macos? I believe Tailscale uses gvisor's tcp/ip lib (as their netstack lib) on macos for certain things.

mdaniel

Does Darwin have network namespaces like the Linux kernel does? I get the impression that's an important component of this approach

maxmcd

Yes, good point, maybe that is the blocker.

gear54rus

can it modify requests or responses? with the current web getting increasingly user-hostile, the need for a tool like this was never more apparent

especially if it doesn't require proxy configuration

dspillett

> especially if it doesn't require proxy configuration

It does require trusting a local CA, or apps away from the browser being configured not to validate CAs (or trust the new CA) if they don't push responsibility for that to the OS-level support.

I'm not sure it would be a good idea for the non-technical public: teaching them how to setup trust for a custom CA and that it is sometimes a good thing to do, would lead to a new exploit route/tool for phishers and other black-hats because many users are too naively trusting or too convenience focussed to be appropriately careful. How many times have we seen people install spyware because of claims that it will remove spyware? It could also be abused by malicious ISPs, or be forced on other ISPs by governments “thinking of the children”.

gear54rus

> How many times have we seen people install spyware because of claims that it will remove spyware?

That is the kind of example that completely disproves your point. How many times do we have to fall into 'just lock everything down for safety' pit and end up with being forced to look at even more ads as a result before we learn?

The only way to be safe is to be informed, 'just works' doesn't exist. Don't trust anyone but yourself.

alexflint

Agreed! So there isn't any interface for modifying requests/responses at present, but it's definitely possible given the underlying approach. If you consider [this line of code](https://github.com/monasticacademy/httptap/blob/main/http.go...) where you have an HTTP request parsed from the <command> that ran and are about to send it out to the public internet: you could modify the request (or the response that is received a few lines further) in just the way that you would modify a normal http.Request in Go.

_boffin_

Injecting random data into telemetry requests to mess up someone’s pretty dashboard?

knome

if the program doesn't pin certificates, you should be able to intercept them by telling your machine to trust a certificate authority of your own creation and performing a mitm attack on the process's traffic. if it does do certificate pinning, then it won't trust your home issued cert, and will refuse to send data through your proxy.
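
To make the pinning point concrete, here is a minimal sketch of what a pinning check can look like on the client side. It assumes the client pins a SHA-256 fingerprint of the whole DER-encoded certificate; real clients often pin the public key instead, and the function names here are hypothetical.

```python
import hashlib
import hmac


def fingerprint(der_cert: bytes) -> str:
    """Hex SHA-256 digest of a DER-encoded certificate."""
    return hashlib.sha256(der_cert).hexdigest()


def matches_pin(der_cert: bytes, pinned_sha256_hex: str) -> bool:
    """True only if the presented certificate hashes to the pinned value.

    A certificate minted by a local interception CA has different bytes,
    so it fails this check even when the system trusts that CA.
    """
    return hmac.compare_digest(fingerprint(der_cert), pinned_sha256_hex.lower())
```

In a real client the DER bytes would typically come from `ssl.SSLSocket.getpeercert(binary_form=True)` after the handshake.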

pcpuser

You might find mitmproxy useful.

alexflint

Yep, mitmproxy is fantastic IMO.

wutwutwat

Did everyone forget about wireshark, which can totally be run as non-root?

https://blog.wireshark.org/2010/02/running-wireshark-as-you/

lights0123

It certainly doesn't provide automated, process-scoped HTTPS interception.

boobsbr

It's still more setup than just installing this tool.

Also, can Wireshark/libpcap decrypt SSL/TLS traffic this easily?

graerg

Not in my experience; I think I gave up and opted for mitmproxy which works but is not this easy/seamless.

alexflint

Wireshark is awesome but yeah as others mentioned it's the TLS decryption piece that is difficult in that workflow

wzyboy

It's a genius idea to run the process in an isolated network namespace!

I'm more interested in the HTTPS part. I see that it sets some common environment variables [1] to instruct the program to use the CA bundle in the temporary directory. This seems to pose a similar issue to all the variants of `http_proxy`: the program may simply choose to ignore the variable.

I see it also mounts an overlay fs for `/etc/resolv.conf` [2]. Does it help if httptap mounts `/etc/ca-certificates` directory with the temporary CA bundle?

[1] https://github.com/monasticacademy/httptap/blob/cb92ee3acfb2...

[2] https://github.com/monasticacademy/httptap/blob/cb92ee3acfb2...
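
As a rough illustration of the environment-variable approach described above, the sketch below launches a command with several commonly honored CA variables pointed at one bundle. The variable list is an assumption for illustration (not necessarily the set httptap uses), and the helper names are hypothetical. A program that ignores all of these variables is unaffected, which is exactly the weakness raised above.

```python
import os
import subprocess

# Variables that some common TLS stacks and tools consult for CA roots.
# Assumed list for illustration only; not taken from the httptap source.
CA_ENV_VARS = (
    "SSL_CERT_FILE",        # OpenSSL-based programs
    "REQUESTS_CA_BUNDLE",   # Python requests
    "CURL_CA_BUNDLE",       # curl
    "NODE_EXTRA_CA_CERTS",  # Node.js
)


def env_with_ca(ca_bundle_path, base=None):
    """Return a copy of the environment with every CA variable set."""
    env = dict(os.environ if base is None else base)
    for name in CA_ENV_VARS:
        env[name] = ca_bundle_path
    return env


def run_with_ca(command, ca_bundle_path):
    """Run command (a list of arguments) with the CA variables applied."""
    return subprocess.run(command, env=env_with_ca(ca_bundle_path), check=False)
```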

alexflint

Thanks! But yep I agree, you're exactly right, it's ultimately... frustrating that there isn't really an agreed-upon or system-enforced way to specify CA roots to an arbitrary process.

It's true that httptap mounts an overlay on /etc/resolv.conf. This is, as you'd expect, due to the also-sort-of-frustrating situation with respect to DNS resolution in which, like CA roots, there isn't a truly reliable way to tell an arbitrary process what DNS server to use, but /etc/resolv.conf is a pretty good bet. As soon as you put a process into a network namespace you have to provide it with DNS resolution because it can no longer access localhost:53, which is the systemd resolver, which is the most common setup now on desktop linux systems.

I do think it might help to mount /etc/ca-certificates as an overlay. When I started looking into the structure of that directory I was kind of dismayed... it's incredibly inconsistent from one distro to the next. Still, it's doable. Interested in any knowledge you might be able to share about how to add a cert to that directory in a way that would be picked up by at least some TLS implementations.
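
The /etc/resolv.conf situation described above can be sketched as a small check: if the configured resolver is a loopback address (as with the systemd stub resolver), a process inside a fresh network namespace cannot reach it, so the file has to be overridden. This is only an illustration with hypothetical function names, not httptap's logic.

```python
import ipaddress


def nameservers(resolv_conf: str):
    """Extract the addresses on 'nameserver' lines of a resolv.conf text."""
    servers = []
    for line in resolv_conf.splitlines():
        stripped = line.strip()
        if not stripped or stripped[0] in "#;":
            continue  # blank line or comment
        fields = stripped.split()
        if len(fields) >= 2 and fields[0] == "nameserver":
            servers.append(fields[1])
    return servers


def _is_loopback(address: str) -> bool:
    try:
        return ipaddress.ip_address(address).is_loopback
    except ValueError:
        return False  # unparsable entry; treat as not loopback


def needs_override(resolv_conf: str) -> bool:
    """True if any resolver is on loopback, i.e. unreachable from a new netns."""
    return any(_is_loopback(s) for s in nameservers(resolv_conf))
```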

xorcist

It's a bit of a thin solution though, isn't it? As you say, it's dependent on both specific CA store and resolver behaviour. It's probably going to be robust enough on the most common SSL libraries, such as OpenSSL. But if we're going that route, why not just run the software against a patched SSL library which dumps the traffic?

That also doesn't require any elevated privileges (as opposed to other methods of syscall interception) and is likely much easier to do. It has the added benefit of being robust against applications either pinning certificates outright or just being particular about serial numbers, client certificates, and anything like that.

0x696C6961

> why not just run the software against a patched SSL library which dumps the traffic?

Why run strace when you can just patch libc?

arjvik

What if instead you bound your own DNS server to localhost:53 inside the network namespace? I suppose you'd still have to mess with /etc/resolv.conf in case it points to hardcoded public resolvers instead like mine does.

adtac

IMO there's no general solution to the HTTPS part that will work for all kinds of programs and the long tail of certificate pinning implementations.

As a proof by counterexample, imagine malware that uses TLS for communication and goes to great lengths to obfuscate its compiled code. It could be a program that bundles a fixed set of CA certificates into its binary and never opens any files on the filesystem. It can still create valid, secure TLS connections (at least for ~10 years or so, until most root CA certificates expire). TLS is all userspace and there's no guarantee that it uses OpenSSL (or any other common library), so you can't rely on hooking into specific OpenSSL functions either. If the server uses a self-signed certificate and the client accepts it for whatever reason, it's worse.

With that said, it's definitely possible to handle 99% of the cases reliably with some work. That's better than nothing.

adtac

Using a TUN device for this is a really cool idea! And the "How it was made" section is one of the best things I've read in a Github README.

I'm building something called Subtrace [1] but it can intercept both incoming and outgoing requests automatically. Looks like we converged on the same interface for starting the program too lol [2]. Subtrace's purpose is kinda different from httptap's though (more observability / monitoring for cloud backend services, hence the emphasis on both incoming and outgoing). Also, it uses a different approach -- using Seccomp BPF to intercept the socket, connect, listen, accept, and ~10 other syscalls, all TCP connections get proxied through Subtrace. We then parse the HTTP requests out of the TCP stream and then show it to the user in the Chrome DevTools Network tab, which we repurposed to work in the browser like a regular webapp.

Any fun stories there from running programs under httptap? Who phones home the most?

[1] https://github.com/subtrace/subtrace

[2] https://docs.subtrace.dev/quickstart

afarah1

Reminds me of NetGuard, which uses Android's VPN service (instead of raw TUN) for packet filtering. https://github.com/M66B/NetGuard

alexflint

Wow, did not know about this!

alexflint

Super cool! Connecting what you capture to Chrome DevTools is fascinating, as is using eBPF. Great work getting the devtools to run as a standalone web app. You won't believe it but I have a half-finished attempt of the same thing for the firefox network tab - in the "networktab" dir of the repo!

Very cool project, would love to learn more and happy to chat more about it.

adtac

Thanks! Subtrace uses BPF, not eBPF :) I think eBPF could be made to work with the same approach, but there's a few differences:

- eBPF requires root privileges or at least CAP_BPF. Subtrace uses seccomp_unotify [1], so it works even in unprivileged environments.

- eBPF requires using eBPF maps as the data channel + weird restrictions in the code because of the eBPF verifier. IMO these two things make it way harder to work with for the kind of networking logic that both httptap and Subtrace have in userspace. Everything is perfectly possible, just harder to reason about and debug.

>half-finished attempt of the same thing for the firefox network tab

Hahahah this is incredible. Something something great minds.

[1] https://man.archlinux.org/man/seccomp_unotify.2.en

eriksjolund

Another tool that can be used by an unprivileged user for analysing network traffic is rootless Podman with Pasta.

Just add the podman run option

--network=pasta:--pcap,myfile.pcap

Pasta then records the network traffic into a PCAP file that could later be analysed.

I wrote a simple example where I used tshark to analyse the recorded PCAP file https://github.com/eriksjolund/podman-networking-docs?tab=re...

alexflint

Very good to know about. But you still have the problem of decrypting TLS traffic.

mdaniel

I don't know if it's a standard but I believe a lot of tls libraries honor the SSLKEYLOGFILE env-var https://wiki.wireshark.org/TLS#:~:text=and%20curl%20when-,th...
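
For what it's worth, Python's own `ssl` module is one example of a library with key-log support: since Python 3.8, `ssl.create_default_context()` honors `SSLKEYLOGFILE`, and a context can also be pointed at a log file explicitly. A minimal sketch (the helper name is hypothetical):

```python
import ssl


def context_with_keylog(path: str) -> ssl.SSLContext:
    """Create a client context that appends TLS secrets to path.

    The resulting file can be loaded into Wireshark to decrypt a capture
    of connections made with this context.
    """
    ctx = ssl.create_default_context()
    ctx.keylog_filename = path  # requires Python 3.8+ built against OpenSSL 1.1.1+
    return ctx
```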

2030ai

That seems like an unnecessary vulnerability waiting to happen.

henvic

Quite interesting! I've written a library that does something similar ("tap") for a Go application: https://github.com/henvic/httpretty https://asciinema.org/a/297429

I also thought about doing something like this for any program, but never really investigated how to do it. Nice to see someone out there created it :)

ranger_danger

Why not use eBPF instead? Then you could see all http requests from all processes at once, including ones that are already running. Plus you wouldn't need to bother with TLS at all, just hook on e.g. write(2).

adtac

How would hooking on write(2) solve TLS? You'll be able to read and modify the ciphertext, but the process will never call write(2) with the plaintext bytes, so you can't actually read the HTTP request. You'll just see the encrypted bytes that go on the wire, but so does the NSA :)

You need the kind of CA certificate trick that httptap uses. It comes with its own set of caveats (e.g. certificate pinning), but it can be made to work reliably in most practical scenarios.

I've spent an unjustifiable amount of time thinking about this specific problem building Subtrace [1], so I'm genuinely very interested in a simpler / more elegant approach.

[1] https://github.com/subtrace/subtrace

jeroenhd

I believe that's how https://github.com/gojue/ecapture works. I don't know the details, but it seems to work!

ddelnano

Yep, that's correct. It uses eBPF uprobes to attach to the SSL_write/SSL_read functions.

ranger_danger

My understanding is that typically a TLS library provides a socket interface for the application to write() to, which can be intercepted by an eBPF program.

alexflint

Unfortunately TLS happens inside the application, not in the kernel, so using eBPF to hook syscalls to write won't help with TLS decryption.

dgl

It is quite simple to use eBPF with uprobes to hook library calls, for example: https://github.com/iovisor/bcc/blob/master/tools/sslsniff.py

The downside is that this doesn't work with anything not using OpenSSL. There are projects like https://github.com/gojue/ecapture which have interceptors for many common libraries, but that approach needs different code for each library.

I think providing a TLS certificate is fine for the use cases of the tool; most tools won't be doing certificate pinning, but ecapture does support Android where this is more likely.

ranger_danger

But read and write syscalls are used by the application to do I/O on the sockets before/after the encryption, which can be intercepted. Or you can attach uprobes directly to the TLS library's own functions.

ARob109

Using uprobes to hook the SSL library, would it be possible to filter content by inspecting and modifying e.g. the decrypted HTTP response?

ranger_danger

absolutely

farnulfo

eBPF TLS tracing: The Past, Present and Future https://blog.px.dev/ebpf-tls-tracing-past-present-future/

somanyphotons

Presumably eBPF requires root privs?

trallnag

I'm having a hard time coming up with a use case where I want to use a tool like that but I'm also lacking root privileges

freedomben

Inside most production environments. I could use this today inside a Pod that isn't allowed root privs.

TacticalCoder

Wouldn't this require root? A big "selling point" of httptap seems to be that precisely it doesn't require root.

Anyway the more options we have, the better.

freedomben

Neat! This will immediately be used by me to debug nginx configs. Currently I use curl -v and have to manually skim the output to figure out what's wrong, but this would immediately make redirect loops and other things apparent. Cool tool!
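
As a sketch of the redirect-loop use case, the snippet below scans a captured trace for a URL that gets a 3xx response more than once. The line format is assumed from the sample output earlier in this thread (`<--- 308 https://... (15 bytes)`) and may not match every httptap version; the function name is hypothetical.

```python
import re

# Assumed response line shape, copied from the sample trace in this thread:
#   <--- 308 https://monasticacademy.org/ (15 bytes)
RESPONSE = re.compile(r"^\s*<--- (\d{3}) (\S+)")


def find_redirect_loop(trace: str):
    """Return the first URL that is redirected (3xx) twice, or None."""
    redirected = set()
    for line in trace.splitlines():
        match = RESPONSE.match(line)
        if not match:
            continue
        status, url = int(match.group(1)), match.group(2)
        if 300 <= status < 400:
            if url in redirected:
                return url
            redirected.add(url)
    return None
```

The trace text could be collected by redirecting httptap's standard output to a file and reading it back.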

alexflint

Very cool! Would love to hear how it goes, especially any features that would be useful in the context of real-world usage.

xyst

Very cool if you need a quick and dirty way to inspect the http/s call stack of an app. Personally prefer eBPF to get _everything_ but using this utility can help drill down what is important in the eBPF trace

sevg

This looks great!

The GitHub profile points to https://www.monasticacademy.org/about which I have no particular opinion on but it did leave me wondering what the connection is between their monastic training retreat and their projects on GitHub.

Edit: Oh, I didn’t go to the bottom of the readme https://github.com/monasticacademy/httptap?tab=readme-ov-fil...

alexflint

Yeah, for other readers who are looking at this thread, the connection is just that this (httptap) is a Monastic Academy project, and what that means is that there is a group of people living on 123 acres in Vermont according to a fairly traditional Buddhist monastic structure (though we are not ordained monks), and during the day we work on a number of technology and non-technology projects together. The link to the readme that sevg posted above is a good overview:

https://github.com/monasticacademy/httptap?tab=readme-ov-fil...

notepad0x90

I really like their approach. Other methods that might use something like LD_PRELOAD fail on statically linked ELFs, like golang binaries.
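
A rough way to tell whether LD_PRELOAD has any chance of working on a given binary is to look for a PT_INTERP program header, which names the dynamic loader; a statically linked ELF has none. The sketch below is a heuristic under stated assumptions: it only handles 64-bit little-endian ELF files, and the function name is hypothetical.

```python
import struct

PT_INTERP = 3  # program header type that names the dynamic loader


def has_interpreter(elf: bytes) -> bool:
    """True if a 64-bit little-endian ELF image has a PT_INTERP segment."""
    if elf[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    if elf[4] != 2 or elf[5] != 1:
        raise ValueError("sketch only handles 64-bit little-endian ELF")
    # ELF64 header fields: e_phoff at 0x20, e_phentsize at 0x36, e_phnum at 0x38.
    (e_phoff,) = struct.unpack_from("<Q", elf, 0x20)
    e_phentsize, e_phnum = struct.unpack_from("<HH", elf, 0x36)
    for index in range(e_phnum):
        # p_type is the first 4 bytes of each program header entry.
        (p_type,) = struct.unpack_from("<I", elf, e_phoff + index * e_phentsize)
        if p_type == PT_INTERP:
            return True
    return False
```

Usage would be along the lines of `has_interpreter(open(path, "rb").read())`; a `False` result suggests the loader never runs, so a preloaded library would not be picked up.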

chanux

This is amazing! I have settled on MITMProxy after looking around for something.

My MITMProxy flow, if anyone is interested: https://gist.github.com/chanux/e87bd91ea2d4a76cb0b872ff79699...