500 Emails a Day. Zero Manual Review.
A digital agency receiving 500 inbound emails per day was sorting, researching, and triaging them by hand. We built a pipeline that reads the inbox, deduplicates against their contact history, enriches every sender with third-party data, scores each lead, and delivers a ranked list every morning. The team stopped touching the inbox.
Every problem. Directly solved.
The team owned nothing, controlled nothing, and touched everything. Here's what changed.
The system runs on infrastructure the client owns. All credentials, API keys, and data stay in their environment. No third-party automation platform in the middle. If we part ways, nothing stops working.
What We Built
Eight components covering the full pipeline from inbox capture to scored lead delivery. Each runs independently and logs every action for a searchable history.
Gmail Spam Folder Monitor
Automated polling of the Gmail spam folder via the Gmail API. New inbound emails are captured and queued without any manual review or folder sorting.
Email Parser & Classifier
Extracts sender name, domain, subject line, and message body. Classifies each email by type and intent so only relevant records move through the enrichment pipeline.
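As a rough illustration of this step, the sketch below parses a raw message with Python's standard `email` module and applies a keyword-based intent check. The category labels, the keyword lists, and the function names `parse_email` and `classify` are assumptions made for the example; the production classifier's rules are not documented here.

```python
from email import message_from_string
from email.utils import parseaddr

# Hypothetical intent labels and keywords, chosen only for illustration.
INTENT_KEYWORDS = {
    "guest_post": ("guest post", "contribute an article"),
    "link_request": ("backlink", "link exchange", "link insertion"),
}


def classify(subject: str, body: str) -> str:
    """Return the first intent label whose keywords appear in the text."""
    text = f"{subject} {body}".lower()
    for label, keywords in INTENT_KEYWORDS.items():
        if any(keyword in text for keyword in keywords):
            return label
    return "other"


def parse_email(raw: str) -> dict:
    """Extract sender name, domain, subject, and plain-text body."""
    msg = message_from_string(raw)
    name, address = parseaddr(msg.get("From", ""))
    domain = address.rpartition("@")[2].lower() if "@" in address else ""

    # Use the first text/plain part of a multipart message, else the message itself.
    part = msg
    if msg.is_multipart():
        part = next(
            (p for p in msg.walk() if p.get_content_type() == "text/plain"), None
        )

    body = ""
    if part is not None:
        payload = part.get_payload(decode=True)
        if payload:
            charset = part.get_content_charset() or "utf-8"
            body = payload.decode(charset, errors="replace").strip()

    subject = msg.get("Subject", "")
    return {
        "sender_name": name,
        "domain": domain,
        "subject": subject,
        "body": body,
        "intent": classify(subject, body),
    }
```

Records whose intent comes back as `other` would be the ones held out of enrichment in this sketch.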
Deduplication Engine
Cross-references each inbound domain against a running contact database. Removes repeat contacts, known vendors, and previously engaged domains before any enrichment runs.
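One way to picture the deduplication check is a single SQLite table keyed on domain, so that a domain can only ever be inserted once. The table name `seen_domains`, its columns, and the helper names are hypothetical; the actual schema is not described in this write-up.

```python
import sqlite3


def open_db(path: str = ":memory:") -> sqlite3.Connection:
    """Open the contact database and make sure the domain table exists."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS seen_domains ("
        " domain TEXT PRIMARY KEY,"
        " reason TEXT NOT NULL)"  # e.g. 'inbound', 'vendor', 'engaged'
    )
    return conn


def is_new_domain(conn: sqlite3.Connection, domain: str) -> bool:
    """Record the domain and report whether it had not been seen before."""
    normalized = domain.strip().lower()
    cur = conn.execute(
        "INSERT OR IGNORE INTO seen_domains (domain, reason) VALUES (?, 'inbound')",
        (normalized,),
    )
    conn.commit()
    # rowcount is 1 when the row was inserted, 0 when the primary key already existed.
    return cur.rowcount == 1
```

Known vendors and previously engaged domains can be preloaded into the same table, so they are rejected by the same check before any enrichment call is made.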
Ahrefs Enrichment
For every unique domain that clears deduplication, the system pulls Domain Rating, organic traffic estimate, and referring domain count directly from the Ahrefs API.
Contact Data Enrichment
Appends contact name, verified email, and company details from Hunter and Apollo for each sender, creating a complete lead record without any manual lookups.
Opportunity Scoring
Each lead is scored on a composite of domain authority, traffic tier, relevance to existing verticals, and outreach quality. High-value targets surface automatically.
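A composite score of this kind can be sketched as a weighted sum of normalized inputs. The weights, the traffic tier cutoffs, and the 0 to 100 output scale below are illustrative assumptions, not the client's actual scoring rules.

```python
from dataclasses import dataclass

# Hypothetical weights; they sum to 1.0 so the final score lands in 0..100.
WEIGHTS = {"authority": 0.4, "traffic": 0.3, "relevance": 0.2, "outreach": 0.1}


@dataclass
class Lead:
    domain: str
    domain_rating: float     # 0..100, as reported by the enrichment step
    organic_traffic: int     # estimated monthly visits
    relevance: float         # 0..1, fit with existing verticals
    outreach_quality: float  # 0..1, quality of the inbound message


def traffic_tier(visits: int) -> float:
    """Map a traffic estimate to a coarse tier value (assumed cutoffs)."""
    if visits >= 100_000:
        return 1.0
    if visits >= 10_000:
        return 0.66
    if visits >= 1_000:
        return 0.33
    return 0.0


def score(lead: Lead) -> float:
    """Weighted composite score on a 0..100 scale, rounded to one decimal."""
    parts = {
        "authority": lead.domain_rating / 100,
        "traffic": traffic_tier(lead.organic_traffic),
        "relevance": lead.relevance,
        "outreach": lead.outreach_quality,
    }
    return round(100 * sum(WEIGHTS[key] * parts[key] for key in WEIGHTS), 1)
```

Keeping the weights in one dictionary makes it easy to adjust them as scoring criteria change.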
Daily Digest Delivery
A ranked list of qualified opportunities is delivered each morning via Slack and email. The team sees only what cleared scoring thresholds. No inbox review required.
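The digest step amounts to filtering by threshold, sorting by score, and formatting the result as text. Here is a minimal sketch of that logic; the default threshold of 60, the cap of 20 entries, and the line format are assumptions, and the actual Slack and email delivery is left out.

```python
def build_digest(scored_leads, threshold=60.0, limit=20):
    """Return a ranked plain-text digest from (domain, score) pairs."""
    qualified = sorted(
        (item for item in scored_leads if item[1] >= threshold),
        key=lambda item: item[1],
        reverse=True,
    )[:limit]
    if not qualified:
        return "No leads cleared the scoring threshold today."
    return "\n".join(
        f"{rank}. {domain} (score {value:.1f})"
        for rank, (domain, value) in enumerate(qualified, start=1)
    )
```

The returned string could be posted to Slack and used as the email body unchanged, which keeps both channels in sync.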
Contact Database & History
Every processed lead is logged with full enrichment data, classification, score, and status. The team can search, filter, and track outreach outcomes over time.
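To show what searching and filtering that history might look like, the sketch below stores a reduced lead record in SQLite and queries it by score and status. The `leads` table, its columns, and the status values are hypothetical, and the enrichment fields are omitted to keep the example short.

```python
import sqlite3

# Assumed, simplified schema; the real record also carries enrichment data.
SCHEMA = """
CREATE TABLE IF NOT EXISTS leads (
    domain         TEXT PRIMARY KEY,
    classification TEXT NOT NULL,
    score          REAL NOT NULL,
    status         TEXT NOT NULL DEFAULT 'new',
    processed_at   TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP
)
"""


def log_lead(conn, domain, classification, score):
    """Insert or overwrite the record for a processed lead."""
    conn.execute(
        "INSERT OR REPLACE INTO leads (domain, classification, score) VALUES (?, ?, ?)",
        (domain, classification, score),
    )
    conn.commit()


def set_status(conn, domain, status):
    """Track an outreach outcome, e.g. 'contacted' or 'won'."""
    conn.execute("UPDATE leads SET status = ? WHERE domain = ?", (status, domain))
    conn.commit()


def find_leads(conn, min_score=0.0, status=None):
    """Return (domain, score, status) rows, highest score first."""
    query = "SELECT domain, score, status FROM leads WHERE score >= ?"
    params = [min_score]
    if status is not None:
        query += " AND status = ?"
        params.append(status)
    query += " ORDER BY score DESC"
    return conn.execute(query, params).fetchall()
```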
How It Works Under the Hood
The system runs on the client's own server. All credentials and API keys are stored in their environment. It operates without CalTech Web involvement.
Email Intelligence Pipeline
Python scripts running on a cloud server the client owns. All Gmail OAuth tokens, API keys, and database credentials are stored in the client environment.
The spam folder is checked every 15 minutes. New emails trigger an enrichment job immediately. The daily digest runs as a scheduled cron job at 7:00 AM local time.
Ahrefs API for domain metrics. Hunter and Apollo for contact data. Internal SQLite database for deduplication history and lead tracking.
All email data, enrichment results, and contact records stay in the client's environment. No data leaves through third-party automation platforms.
6 Weeks from Kickoff to Full Go-Live
Audit current spam review process. Document scoring criteria and existing contact database. Provision server and configure Gmail API OAuth.
Build email parser, classifier, and deduplication logic. Connect to existing contact database. Run first test batch against historical spam backlog.
Integrate Ahrefs API and Hunter/Apollo connectors. Build scoring algorithm with the client's input. Validate output quality against manual benchmarks.
Build Slack and email digest. Run automated system alongside manual review for one full week. Team compares outputs and surfaces scoring adjustments.
Full deployment. Team switches from manual spam review to digest-only workflow. Written SOPs delivered for managing thresholds and scoring rules.
Score threshold tuning as outreach patterns shift. Additional data source integrations. CRM sync or outreach automation as next-phase scope.
This Engagement
One-time build fee to deploy the full pipeline, followed by a 3-month retainer for tuning, enrichment source updates, and expansion.
500 emails per day at 10 minutes each comes to roughly 83 hours of manual work eliminated every day. At a conservative $25/hr, that's over $2,000 in labor recovered daily.
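The labor figures can be checked with a few lines of arithmetic. The function name `daily_savings` is ours; the inputs are the numbers quoted above.

```python
def daily_savings(emails_per_day: int, minutes_per_email: float, hourly_rate: float):
    """Return (hours saved per day, labor cost recovered per day)."""
    hours = emails_per_day * minutes_per_email / 60
    return hours, hours * hourly_rate


hours, dollars = daily_savings(emails_per_day=500, minutes_per_email=10, hourly_rate=25)
print(f"{hours:.1f} hours/day, ${dollars:,.2f}/day")
```

With these inputs, 5,000 minutes per day works out to about 83.3 hours and about $2,083 in labor.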
Is This Right for Your Team?
This build is a strong fit for any agency or team that receives high volumes of inbound email across a single channel and wastes time sorting signal from noise before any real work begins.
- 100+ inbound emails per week from a consistent source
- Manual lookup step eating into productive time
- Repeated outreach from the same contacts or domains
- Team decisions based on domain quality metrics
- Existing contact list or CRM to check against
Want to Build Something Like This?
Every build starts with a 30-minute scoping call. Tell us the workflow, and we'll tell you if it's a fit.
