
Show HN: Cleverb.ee – open-source agent that writes a cited research report

cantaloupe

I browsed the GitHub and website for a bit, but didn’t see any examples! It would be useful to share the output for a common question that can be substantiated with reliable sources, like “Is coffee good for me?”. Even better if you can show a comparison to other deep research tools. From the copy it seems like Cleverbee could pull from more diverse sources (e.g. YouTube) and fewer unreliable sources (e.g. product blogs). Show that off!

nickwatson

endianswap

lol what

> Reference to failed web browser attempt for Rush University

NitpickLawyer

> Reference to failed web browser attempt for Rush University

Leslie, I typed your symptoms into the thing up here, and it says you could have network connectivity problems.

nickwatson

Fixed, and pushed to the repo.

If it hits a timeout but the content is still there, it will now return the content (without the error message) to ensure something small like this doesn't get synthesized.

That's the thing about synthesizing, and I feel it's one of the strengths of this system: the LLM doesn't put too much of its own "creativity" (which usually comes with hallucination) on top of the findings!

TL;DR: Fixed.

nickwatson

Good spot! I'm already on the case with that one.

The Rush University website takes a long time to load (check it out). The script recognized that the article had loaded, but something on the website was causing it to hang (waiting for networkidle status), so it terminated the parsing early and worked with the content it had.

So in a nutshell, it still parsed but with a warning/error that this happened.

I'm optimizing this now to exclude the wording from the report, or make a note.
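
For the curious, the fallback looks roughly like this (a minimal sketch using Playwright's async API; names are illustrative, not the actual Cleverbee code):

```python
# Minimal sketch of the timeout fallback described above, using Playwright's
# async API. Illustrative only, not the actual Cleverbee implementation.
from playwright.async_api import async_playwright, TimeoutError as PlaywrightTimeout

async def fetch_page(url: str) -> str:
    async with async_playwright() as p:
        browser = await p.chromium.launch()
        page = await browser.new_page()
        await page.goto(url, wait_until="domcontentloaded")
        try:
            # Pages that keep background requests open may never go idle.
            await page.wait_for_load_state("networkidle", timeout=10_000)
        except PlaywrightTimeout:
            # The article is usually already rendered: salvage the HTML we
            # have instead of surfacing an error message into the report.
            pass
        html = await page.content()
        await browser.close()
        return html
```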

nickwatson

I think that's a very good idea, thanks!

I'm actually working on "The Beehive" at the moment, where the app can push the research to a hive on the website, so people can share their research/discoveries.

My client's paid work takes priority, but I hope to do it over the course of this week.

P.S. Running a report on "Is coffee good for me?" for you now to show you this example ;)

nickwatson

Thanks for all the thoughtful feedback today, everyone.

I’m logging the ideas (grounding, source ranking, etc.) and will open issues tonight.

Heading offline now but I’ll circle back tomorrow. Feel free to keep the questions coming!

smallnix

How do you do the citing? Reverse-RAG post-processing?

nickwatson

Good question — it's pretty straightforward right now:

I pass the collected content chunks (with their original URLs attached) into Gemini 2.5 Pro, asking it to synthesize a balanced report and to add inline citations throughout.

So it's not doing anything fancy like dynamic retrieval or a classic RAG architecture.

Basically:

- The agent gathers sources (webpages, PDFs, Reddit, etc.)

- It summarises each as it goes (using a cheaper model)

- It then hands a bundle of summarised + raw content to Gemini 2.5 Pro

- Gemini 2.5 Pro writes the final report, embedding links directly as [1], [2], etc. style citations throughout
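
In code, the flow looks roughly like this; `summarize` and `synthesize` are hypothetical stand-ins for the cheaper and primary LLM calls, not Cleverbee's actual functions:

```python
# Rough sketch of the pipeline above. `summarize` and `synthesize` are
# placeholders for the cheaper-model and Gemini 2.5 Pro calls respectively.
from dataclasses import dataclass

@dataclass
class Source:
    url: str
    raw_content: str
    summary: str = ""

def build_report(question: str, sources: list[Source], summarize, synthesize) -> str:
    # Summarise each source as it is collected, using the cheaper model.
    for src in sources:
        src.summary = summarize(src.raw_content)

    # Bundle summaries + raw content, each tagged with its URL and index.
    bundle = "\n\n".join(
        f"[{i}] {src.url}\nSUMMARY: {src.summary}\nRAW: {src.raw_content}"
        for i, src in enumerate(sources, start=1)
    )

    prompt = (
        f"Write a balanced research report answering: {question}\n"
        "Cite inline as [1], [2], ... using ONLY the URLs listed below, "
        "copied exactly as given.\n\n" + bundle
    )
    return synthesize(prompt)
```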

Reverse-RAG is something I definitely want to implement, once I can afford a better computer to run this at scale. Even an 8B model takes overnight to summarize an average piece of content for me right now! But I'm also keeping an eye on the pace at which the larger LLM space moves. The size and abilities of the context windows of the likes of Gemini 2.5 Pro are pretty crazy these days!
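
When I do, I imagine the post-processing pass looking something like this (purely illustrative; `retrieve_chunk` and `supports` are hypothetical helpers, e.g. backed by that local 8B model):

```python
# Purely illustrative reverse-RAG pass: for each cited claim in the report,
# re-retrieve the cited source text and check it actually supports the claim.
def reverse_rag_check(report_claims, retrieve_chunk, supports):
    flagged = []
    for claim, url in report_claims:            # (claim text, cited URL) pairs
        evidence = retrieve_chunk(url, claim)   # best-matching chunk from source
        if not supports(evidence, claim):       # entailment check, small model
            flagged.append((claim, url))        # candidate hallucination
    return flagged
```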

Thanks for the question.

iamandoni

Do you take any measures to prevent link hallucination? What about content grounding / attribution verification?

nickwatson

At the moment the measures taken are:

- Full content analysis by the primary LLM (default is Gemini 2.5 Pro), with the link hard-coded alongside each piece of content and structured output for better parsing.

- Temperature right down (0.2), strict instructions to synthesize, and precise prompts to attribute links exactly and without modification.

What I hope to introduce:

- Hard-coded parsing of links mentioned in the final report, to verify them against the link map created throughout the research journey (sketched below)

- An optional "double-checking" LLM review of synthesized content to ensure no drift

- RAG enhancements for token-efficient verification and subsequent user questions (post-research)
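
The first item could be as simple as a set difference. A minimal sketch, assuming the report is Markdown and the agent keeps a set of every URL it actually visited:

```python
# Minimal sketch of the planned link verification: extract every Markdown
# link from the final report and flag any URL that is not in the link map
# built during research. Regex and names are illustrative.
import re

MD_LINK = re.compile(r"\[[^\]]*\]\((https?://[^)\s]+)\)")

def find_hallucinated_links(report_md: str, link_map: set[str]) -> list[str]:
    cited = {m.group(1) for m in MD_LINK.finditer(report_md)}
    return sorted(cited - link_map)  # anything here was never actually visited
```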

Do you have any further suggestions?

Right now I hope to strike a delicate balance between token efficiency and grounding, with enhanced grounding available as optional settings in the future. I have a big task list, and this is on it; I'll re-prioritize alongside user requests for the different features.

Of course, being open source, contributions are highly welcome. I would love to see large community involvement. Collaboration benefits everyone.

P.S. I have spent hundreds of dollars on tests. I'd say for every hour of building, about three hours have gone into testing: debugging, optimizing quality, and ensuring guard-rails are in place.

If you go to the repo, also check out the config/prompts.py file - it will give you a little more insight into what is going on (there are code checks as well, but generally it gives you an idea).

devmor

This is not research. This is a search engine.

dackdel

this looks useful!!!!

nickwatson

Hi HN

I built *cleverb.ee* to solve my own research pain: too many tabs, too much messy sourcing. Gemini and OpenAI deep research tools weren't getting the balanced/unbiased quality I desired.

*What it does*:

• Reads webpages, PDFs, Reddit posts, PubMed abstracts, YouTube transcripts.

• Plans with Gemini 2.5 Pro, acts with Gemini 2.5 Flash, summarises with Gemini 2.0 Flash (or you can use any local LLM or Claude).

• Outputs a fact-checked, cited Markdown report with live token usage tracking.
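
The model split boils down to something like this (an illustrative mapping; the model-name strings and structure are assumptions, not the actual config file):

```python
# Illustrative mapping of the model roles above; not Cleverbee's actual
# config file, and the model-name strings are assumptions.
MODEL_ROLES = {
    "planner":    "gemini-2.5-pro",    # plans the research steps
    "actor":      "gemini-2.5-flash",  # drives browsing and tool use
    "summarizer": "gemini-2.0-flash",  # cheap per-source summaries
}
```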

*Tech*: Python + Playwright + LangChain with MCP tool support. AGPL-3.0 licensed.

*Why open source?*: I wanted full transparency at every agent step and easy pluggable data sources.

Quick install:

```bash
git clone https://github.com/SureScaleAI/cleverbee
cd cleverbee && bash setup.sh
```

Would love feedback — especially what critical research sources you’d want integrated next!

kleiba

> Gemini and OpenAI deep research tools weren't getting the balanced/unbiased quality I desired.

Could you elaborate, please?

nickwatson

I felt they would just "cast the net wide" with a quick search-and-collect at scale, then layer the LLM's own training on top, and the reports I generated were giving me hallucinated content.

I wanted something more: a collect-evaluate-decide loop that iterates through discoveries and actively seeks out diverse sources.
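
Roughly, the loop I'm going for looks like this (an illustrative skeleton; `plan`, `search`, `evaluate` and `decide` are hypothetical stand-ins for the agent's tools, not the actual code):

```python
# Illustrative collect-evaluate-decide skeleton. The four callables are
# hypothetical stand-ins for the planner LLM, the browsing tools, the
# summarizer/credibility check, and the "what next?" decision step.
def research(question, plan, search, evaluate, decide, max_rounds=5):
    findings = []
    queries = plan(question)                  # initial research plan
    for _ in range(max_rounds):
        for query in queries:
            for source in search(query):      # webpages, PDFs, Reddit, PubMed...
                note = evaluate(source)       # summarize + assess credibility
                if note is not None:
                    findings.append(note)
        queries = decide(question, findings)  # follow leads, diversify, or stop
        if not queries:
            break
    return findings
```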

Quanttek

Can you specify? The default heavy reliance on Reddit and YouTube, rather than trusted publications (e.g. Scientific American, NYTimes) and scientific publications, is worrying given widespread misinformation in certain scientific fields (e.g. nutrition, health, economics).

nickwatson

I never said "heavy reliance" on Reddit/YouTube. The agent is actually instructed to use discernment to recognize poor or biased sources and opinions, and to label them as such (see the example report on coffee which I shared previously in another comment).

Most of the time it has only sought out one or two posts/YouTube videos, as it recognizes their low credibility value.

It comes loaded with a PubMed MCP tool, and the beauty of it being open source is that you can exclude or limit the sources as much as you want, or add in new sources. That's why I wanted to open it up: to allow for critique of the methodology and for improved, balanced research from experts.

It is also instructed to evaluate each source and whether or not they have "some benefit to gain" from the article, and to balance this into the research as well.

semi-extrinsic

I think it's really unfortunate that this type of thing gets called "research". I get that it fits with what has unfortunately become modern day usage - "Karen did her own research before becoming a flat-earther" - but I really wish the AI companies would've had better faith in their future solutions than to call this research.

There's gotta be quite a few actual researchers at these companies who are shaking their heads.

To spare others the lookup, here's from the Oxford dictionary. Emphasis on the word "new":

  To study a subject in detail, especially in order to discover new information or reach a new understanding.
 

  Example: They are carrying out/conducting/doing some fascinating research into/on the language of dolphins.

nickwatson

I understand what you're saying. I believe any kind of research will nearly always begin with learning and understanding the knowledge that is already out there.

Almost every subject has been learned this way, whether at school from a teacher or text-book, or reading papers.

The Oxford dictionary definition says the same, "to study a subject in detail". This is what AI is doing - I see it as a "power suit" for distilling information much faster, without the cognitive bias that many of us will carry.

Learning is an important part of research, and this must come with discernment over the credibility of existing research, including identifying where the gaps are. This kind of critical thinking allows for another level (experiments, surveys, etc.) to uncover things even further.

If you were to study the language of dolphins today, where would you start? Would you jump into the ocean and start trying to talk with them, or would you look up what is already discovered? Would you study their behaviors, patterns, etc?

What drove me to do this project is exactly the example you mentioned: the flat-earther type who looks up an article on some kind of free hosting website, or Sandra from accounts' social media page, and takes it as the be-all-and-end-all of knowledge. It comes without bias recognition or critical thinking skills. This is where I hope to level the playing field and ensure unbiased, balanced information is uncovered.

latexr

> without the cognitive bias that many of us will carry.

It is naive and incorrect to believe LLMs do not have biases. Of course they do, they are all trained on biased content. There are plenty of articles on the subject.

> Would you jump into the ocean and start trying to talk with them, or would you look up what is already discovered?

Why resort to straw-man arguments? Of course anyone would start by looking up what has already been discovered; that doesn't immediately mean reaching for and blindly trusting any random LLM. The first thing you should do, in fact, is figure out which prior research is important and reliable. There are too many studies out there which are obviously subpar or outright lies.

nickwatson

I agree, LLMs have biases. It was my primary desire in building this tool to put the weight on the LLMs to synthesize rather than to think about and interpret the subjects. That's actually the main goal of this tool - maybe I don't articulate that as well as I could - I'm open to suggestions here!

I agree about first figuring out which research is most important and reliable. There is a planning stage that considers the sources and which ones hold credibility.

In addition, the user has full control over the sources the tool uses, and can even add their own (MCP tools).

And being open source, you have full control over the flow/prompts/source methods/etc., so you can optimize this yourself and even contribute improvements to ensure this benefits research as a whole.

I welcome your feedback, and any code amendments you propose to improve the tool. You clearly understand what makes good research and your contributions will be highly valued by all of us.

vidarh

A more precise term for what it is doing would be a "literature review".

But I think you're right to describe it as research in the headline, because a lot of people will relate more to that term. But perhaps describe it as conducting a literature review further down.

nickwatson

I agree. In all honesty I was just following the trend popularized by OpenAI/Google so it is more relatable, but I will mention "literature review" as you suggest - it's a good idea.

I didn't give the wording too much thought in all honesty - was just excited to share.

Where would you suggest putting the literature review text? The README.md?

What about something like "synthesized findings from sources across the internet"?

When I see the word literature, I immediately think of books.

sReinwald

I really have to challenge the notion of AI "distilling information without cognitive bias."

First, AI systems absolutely embody cognitive biases - they're just different from human ones. These systems inherit biases from:

  - Their training data (which reflects human biases and knowledge cutoffs)
  - Architectural decisions made by engineers  
  - Optimization criteria and reinforcement learning objectives  
  - The specific prompting and context provided by users

An AI doesn't independently evaluate source credibility or apply domain expertise - it synthesizes patterns from its training data according to its programming.

Second: You frame AI as a "power suit" for distilling information faster. While speed has its place, a core value of doing research isn't just arriving at a final summary. It's the process of engaging with a vast, often messy, diversity of information, facts, opinions, and even flawed arguments. Grappling with that breadth, identifying conflicting viewpoints, and synthesizing them _yourself_ is where deep understanding and critical thinking are truly built.

Skipping straight to the "distilled information," as useful as it might be for some tasks, feels like reading an incredibly abridged version of Lord of the Rings: A small man finds a powerful ring once owned by an evil God, makes some friends and ends up destroying the ring in a volcano. The end. You miss all the nuance, context, and struggle that creates real meaning and understanding.

Following on from that, you suggest that this AI-driven distillation then "allows for another level, experiments, surveys, etc to uncover things even further." I'd argue the opposite is more likely. These tools are bypassing the very cognitive effort that develops critical thinking in the first place. The essential practice for building those skills involves precisely the tasks these tools aim to automate: navigating contradictory information, assessing source reliability, weighing arguments, and constructing a reasoned conclusion yourself. By offloading this fundamental intellectual work, we remove the necessary exercise. We're unfortunately already seeing glimpses of this, with people resorting to shortcuts like asking "@Grok is this true???" on Twitter instead of engaging critically with the information presented to them.

Tools like this might offer helpful starting points or quick summaries, but they can't replicate the cognitive and critical thinking benefits of the research journey itself. They aren't a substitute for the human mind actively wrestling with information to achieve genuine understanding, which is the foundation required before one can effectively design meaningful experiments or surveys.

nickwatson

Very true, and it got me thinking a lot.

As humans, we align to our experiences and values, all of which are very diverse and nuanced. It reminds me of a friend who loves any medical conspiracy theory, whose dad was a bit of an ass to him and, of course, a scientist!

Without our cognitive biases, are we truly human? Our values and our desired outcomes are inherently part of what shapes us, and of course the sources we choose to trust reinforce this.

It's this that makes me think AGI, or a human-like ability for AI to think, can never be achieved, because we are all biased, like it or not. Collectively, and through challenging each other, this is what makes society thrive.

I feel there is no true path towards a single source of truth, but collaboratively we can at least work towards getting there as closely as possible.

gnuly

this case should simply be called search, no?

to me research takes a long time, and not just an hour or so.

pcthrowaway

Well the models often do an initial search and then a follow-up search. So it's a re-search

nickwatson

It does it several times, so maybe re-re-re-search works?

kleiba

You overlook the fact that the system integrates various sources into a coherent report. This, in my opinion, makes it more than mere search.

yard2010

New is a matter of perspective.
