Evaluating modular RAG with reasoning models

anonymousDan

Is RAG any good for coding tasks?

mkesper

Latency must be brutal here. This will not be possible for any chat application, I guess.

bauefi

It depends on how you do retrieval. If you just use dense embeddings, for example, you can get the latency of a single search query down to something like 400ms. In that case multiple sequential lookups would be OK, but your embeddings need to be good enough, of course.
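
To put rough numbers on that, here is a minimal back-of-the-envelope sketch. It assumes each lookup waits for the previous one, so retrieval latency adds up linearly, and it takes the ~400ms figure above as the per-lookup cost. The function names and the 2000ms budget in the usage note are hypothetical, picked only for illustration; model generation time is not included.

```python
def sequential_retrieval_latency_ms(num_lookups: int, per_lookup_ms: float = 400.0) -> float:
    """Total retrieval latency when lookups run one after another.

    per_lookup_ms defaults to the rough 400ms estimate from the comment above;
    it is an assumption, not a measured value.
    """
    if num_lookups < 0:
        raise ValueError("num_lookups must be non-negative")
    return num_lookups * per_lookup_ms


def max_lookups_within_budget(budget_ms: float, per_lookup_ms: float = 400.0) -> int:
    """How many sequential lookups fit into a given retrieval latency budget."""
    if per_lookup_ms <= 0:
        raise ValueError("per_lookup_ms must be positive")
    if budget_ms < 0:
        raise ValueError("budget_ms must be non-negative")
    return int(budget_ms // per_lookup_ms)
```

For example, `sequential_retrieval_latency_ms(3)` gives 1200.0, and with a hypothetical 2000ms retrieval budget `max_lookups_within_budget(2000)` gives 5. Whether that is acceptable for chat depends on how much extra time the model itself spends reasoning between lookups.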

emil_sorensen

Yep, even with a small bump in performance (which we only saw for a subset of coding questions), it wouldn't be worth the huge latency penalty. Though the latency will surely go down over time.

emil_sorensen

Curious if anyone else has run similar experiments?