I Was Asking AI to Do the Wrong Job

A 5-month journey from manual competitive intelligence to a working augmentation system

The Problem

Every week I spent hours on competitive intelligence: running Google searches one at a time, synthesizing what I found, writing it up, sharing it with the team, and still wondering whether I'd caught everything. The loop was slow. The coverage was spotty. If a story broke on a day I wasn't searching, I missed it.

What I Tried First

First, I built a web scraper with AI relevancy scoring; the sources were unreliable and quality was impossible to control. Then I built an orchestration layer: semantic search, automated analysis, brief generation, all in one run. When I shared it with my VP, the sources didn't hold up under scrutiny, we couldn't ask follow-up questions, and there was no mechanism for the system to improve over time. Two different architectures, same fundamental problem.

"I was asking AI to do the wrong job."

The Insight

The breakthrough wasn't better automation. It was a different question: what should AI do, and what should humans do? Map the division of labor. Let trusted publications curate through their editorial process. Let users signal what matters through simple likes and dislikes. Move AI from the judgment seat to the production seat: given what the user cares about, write the brief in the format leadership already reads.

What Makes It Interesting

ML scoring without API costs

Embedding-based cosine similarity against user preference centroids in pgvector. Free, unlimited, no rate limits. The first scoring layer that filters thousands of stories before any LLM call.
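A minimal sketch of that first scoring layer, assuming illustrative shapes (the real system runs the same cosine math inside pgvector, which exposes it as the `<=>` cosine-distance operator in SQL):

```typescript
// Each story embedding is compared to the user's preference centroid;
// stories below a similarity threshold never reach an LLM call.

type Story = { id: string; embedding: number[] };

function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// The preference centroid is just the mean of liked-story embeddings.
function centroid(embeddings: number[][]): number[] {
  const dim = embeddings[0].length;
  const mean = new Array<number>(dim).fill(0);
  for (const e of embeddings) {
    for (let i = 0; i < dim; i++) mean[i] += e[i] / embeddings.length;
  }
  return mean;
}

// Cheap, unlimited prefilter: no API call, no rate limit.
function prefilter(stories: Story[], prefCentroid: number[], threshold = 0.3): Story[] {
  return stories.filter(s => cosineSimilarity(s.embedding, prefCentroid) >= threshold);
}
```

Because likes and dislikes only move the centroid, the filter improves from user signals without any model retraining.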

Curation agents with forced rejection

Version 1 of the scoring prompt: every story came back as a 4 or 5. The LLM defaulted to justifying everything. The fix wasn't a better prompt. It was a structural redesign: the model must produce a rejection list with per-story reasoning. Forced to argue against inclusion, not justify it.
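The structural redesign can be sketched as an output contract plus a guard, with field names assumed for illustration: every candidate must land in exactly one of two lists, and every rejection must carry its own reasoning.

```typescript
// Hypothetical shape of the curation agent's structured output.
type CurationResult = {
  curated: { storyId: string; score: number }[];
  rejected: { storyId: string; reason: string }[];
};

// Guard: every candidate appears exactly once, and every rejection
// carries a non-empty per-story reason. An output that silently
// drops a story, or rejects without arguing, fails validation.
function validateCuration(candidateIds: string[], result: CurationResult): boolean {
  const seen = new Set([
    ...result.curated.map(s => s.storyId),
    ...result.rejected.map(s => s.storyId),
  ]);
  const allAccounted = candidateIds.every(id => seen.has(id));
  const noOverlap = seen.size === result.curated.length + result.rejected.length;
  const reasonsPresent = result.rejected.every(r => r.reason.trim().length > 0);
  return allAccounted && noOverlap && reasonsPresent;
}
```

The contract, not the prompt wording, is what forces the model to argue against inclusion.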

Research depth on demand

Go deeper on any story, ask follow-up questions, get context. The thing the orchestration layer couldn't do. The human decides what's worth investigating. AI does the investigation.

"So what?" built into the output

Most AI summarization stops at "here's what happened." Briefs connect stories to strategic implications: what it means for competitive positioning, what questions leadership should be asking, what to watch next. That translation is encoded in vertical-specific guide files that capture how the organization thinks about each domain.
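A guide file might look something like this sketch (field names and content are illustrative, not taken from the actual system): a small, per-vertical structure that tells the brief-writer what "so what?" means in that domain.

```typescript
// Illustrative shape of a vertical-specific guide file.
type VerticalGuide = {
  vertical: string;
  positioningLens: string[];     // how to frame competitive implications
  leadershipQuestions: string[]; // questions a brief should surface
  watchlist: string[];           // signals to track next
};

const healthTechGuide: VerticalGuide = {
  vertical: "health-tech",
  positioningLens: ["payer relationships", "regulatory posture"],
  leadershipQuestions: ["Does this change our partner calculus?"],
  watchlist: ["FDA clearances", "CMS rule changes"],
};

// The guide is rendered into the brief-generation prompt so the
// "so what?" framing is the organization's, not the model's.
function soWhatSection(guide: VerticalGuide): string {
  return [
    `For ${guide.vertical}, frame implications through: ${guide.positioningLens.join(", ")}.`,
    `Surface questions such as: ${guide.leadershipQuestions.join(" ")}`,
    `Flag what to watch next: ${guide.watchlist.join(", ")}.`,
  ].join("\n");
}
```

Keeping this knowledge in plain data files means a new vertical is a new file, not a new prompt.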

The Sycophancy Fix

The structural redesign was the forced rejection mechanism. The model must produce a full rejection list with per-story reasoning. "For every story you reject, explain why it doesn't make the cut." This flips the default: instead of justifying inclusion, the model has to argue against it. Same principle as "argue against your own position" in debate.

Production data revealed the fix wasn't complete. The model curated a story it recognized as irrelevant in its own reasoning: it labeled "Lamar Jackson" as a sports story, then included it anyway. Fix: a structural constraint ensuring reasoning classification matches output placement. The pattern: treat prompt engineering like debugging code. Isolate the failure mode, write a test case, apply a targeted constraint.
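That test-case discipline can be sketched as a consistency check (shapes assumed for illustration): if the model's own reasoning classifies a story as off-topic, the story must not appear in the curated output.

```typescript
// Each scored story carries both the model's reasoning label and
// where the model actually placed it.
type ScoredStory = {
  storyId: string;
  reasoningLabel: "relevant" | "off-topic";
  placement: "curated" | "rejected";
};

// Returns ids where reasoning and placement contradict each other,
// i.e. the failure mode seen in production.
function findContradictions(stories: ScoredStory[]): string[] {
  return stories
    .filter(s => s.reasoningLabel === "off-topic" && s.placement === "curated")
    .map(s => s.storyId);
}

// Regression case distilled from the production failure: a sports
// story labeled off-topic in the reasoning but curated anyway.
const regression: ScoredStory[] = [
  { storyId: "lamar-jackson", reasoningLabel: "off-topic", placement: "curated" },
  { storyId: "acq-announcement", reasoningLabel: "relevant", placement: "curated" },
];
```

Once the failure is a data structure, it becomes a permanent regression test (the stack's Vitest suite is the natural home for it) rather than a one-off prompt tweak.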

5 months of iteration · 3 architectures · 10 hrs/month → 30 min/month · 7,874 stories processed

What I'd Do Differently

Start with the augmentation question: what should AI do, and what should humans do? I built two full systems before asking it.

Try It

Health tech version. Same architecture, different vertical.

Open live demo
Built with React + TypeScript · Supabase + pgvector · Deno Edge Functions · Gemini · Vitest