
I Deployed a Dark Web Scanner in 5 Minutes: Here's What It Found

Bright Coding

Author

What if your brand is being sold on the dark web right now? Most security teams discover breaches months too late. By the time someone spots stolen credentials on a paste site or your customer database in a ransomware leak, the damage is done. The average time to identify and contain a breach? 277 days, according to IBM's Cost of a Data Breach report. That's about nine months of silent exploitation.

Here's the painful truth: enterprise threat intelligence platforms cost $50,000 to $500,000 annually. They lock you into proprietary data feeds, bury you in alerts, and still miss the Southeast Asian threat landscape entirely. If you're defending organizations in the Philippines, Indonesia, or Singapore, you're flying blind with tools designed for Silicon Valley.

But what if you could deploy a military-grade dark web scanner for free? One that crawls .onion sites, monitors Telegram channels where ransomware groups coordinate, tracks 324+ ransomware gangs, and delivers a daily intelligence digest, all from a single deploy command?

Meet osintph/darkweb-scanner. Built by OSINT PH for the Southeast Asian security community, this open-source platform is the secret weapon that top threat hunters are quietly deploying. And I'm about to show you exactly how to wield it.


What is osintph/darkweb-scanner?

osintph/darkweb-scanner is a self-hosted, open-source cyber threat intelligence platform engineered specifically for the Philippine and Southeast Asian security landscape. Created by OSINT PH, a collective of regional security researchers, this tool democratizes access to capabilities that were previously exclusive to Fortune 500 SOCs and government agencies.

At its core, the platform is an async Tor-based crawler married to a comprehensive intelligence dashboard. But calling it a "scanner" undersells its power. This is a full threat intelligence operating system: it profiles threat actors, monitors ransomware victim feeds, performs infrastructure reconnaissance, scrapes Telegram channels for criminal coordination, and correlates indicators of compromise from multiple sources. All automated, all auditable, all under your control.

Why it's trending now:

  • Ransomware attacks in Southeast Asia surged 78% in 2024, yet regional context remains absent from Western threat intel platforms
  • The ransomware.live PRO integration unlocks enterprise-grade data (324+ groups, 26,000+ victims) at zero cost
  • Self-hosting eliminates data sovereignty concerns, which is critical for APAC compliance regimes
  • The one-command deployment removes the traditional barrier: you no longer need a dedicated DevSecOps team to run threat intel

Version 1.0.1 ships under AGPL v3, meaning the community owns its evolution. No vendor lock-in. No hidden API costs. No data feeding competitive intelligence back to a SaaS provider.


Key Features That Make This Insane

The feature density of this platform punches well above its weight class. Here's what separates it from toy scanners and overpriced enterprise dashboards:

🕸️ Dark Web Crawler: An async Tor-based engine that doesn't just check if .onion sites are "up." It performs configurable keyword monitoring with real-time alerting, meaning you can watch for your executive names, product codenames, or customer databases as they appear in criminal marketplaces. The async architecture allows hundreds of concurrent connections without one slow site blocking the rest.

📊 Intelligence Dashboard: This isn't a static report. The new start page delivers live threat level scoring, ransomware victim feeds with SEA regional highlighting, group rankings by activity, Southeast Asian country breakdowns, a ThreatFox IOC mini-feed, and curated press headlines. It's the morning briefing every CISO needs but most can't afford.

🔥 ransomware.live PRO Integration: This is where it gets wild. The free PRO API unlocks 324+ tracked ransomware groups, 26,000+ documented victims, extracted IOCs, negotiation chat logs, actual ransom notes, YARA rules for detection, SEC 8-K filing alerts (for publicly traded victim notifications), and a CSIRT directory. Three thousand API calls per day. Forever free. This single integration destroys the value proposition of commercial ransomware intelligence feeds.

🛡️ IOC Feed: Live indicators from ThreatFox, URLhaus, and Feodo Tracker with search, type filtering (IP, domain, hash, URL), and confidence scoring. No more pivoting between five browser tabs to verify if that IP is actually malicious.

📱 Channel Monitor: An interactive dashboard tab that lets you scrape any public Telegram channel on demand, no CLI required. It auto-translates messages to English, downloads media, and packages everything into a ZIP with an HTML report. Ransomware groups coordinate openly on Telegram. Now you can watch them in near real time.

🔍 Infrastructure Recon: This module alone justifies the deployment. Passive + active DNS recon with DNSDumpster enrichment, active subdomain brute-forcing, TCP port scanning across 30 services, HTTP directory enumeration, certificate transparency history, zone transfer attempts, SPF/DMARC/DKIM email security scoring, interactive subdomain node graphs, per-IP port heatmaps, and PDF export with world map visualization. Penetration testers charge $10,000+ for this scope.

🧰 OSINT Toolkit: Seven proxied tools accessible from the dashboard: Shodan, Censys, GreyNoise, URLScan, MXToolbox, SecurityTrails, VirusTotal. No API key management hell. No tab switching. Unified interface.

📧 Daily Digest: Automated morning email with CISA KEV catalog updates, OTX pulses, abuse.ch feeds, and curated RSS, delivered via Mailgun. Your threat briefing, automated.
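
To make the crawler's "async plus keyword monitoring" idea concrete, here is a minimal sketch of the pattern, not the project's actual crawler. The keyword patterns, the placeholder URLs, and the fake_fetch stand-in (which replaces a real HTTP request routed through Tor) are all assumptions made for illustration.

```python
import asyncio
import re

# Hypothetical keywords; the real rules live in config/keywords.yaml.
KEYWORDS = [re.compile(p, re.IGNORECASE) for p in (r"acme\s*bank", r"project\s+halcyon")]


async def fake_fetch(url):
    """Stand-in for an HTTP GET routed through Tor's SOCKS proxy (assumption)."""
    await asyncio.sleep(0)  # yield control, as a real network call would
    pages = {
        "http://exampleonionaddress1.onion/": "Selling ACME Bank customer dump",
        "http://exampleonionaddress2.onion/": "Nothing of interest here",
    }
    return pages.get(url, "")


async def scan_one(url, fetch, limiter):
    """Fetch one page and return (url, list of matching keyword patterns)."""
    async with limiter:  # cap the number of in-flight requests
        body = await fetch(url)
    hits = [k.pattern for k in KEYWORDS if k.search(body)]
    return url, hits


async def crawl(urls, fetch=fake_fetch, max_concurrency=50):
    """Scan all URLs concurrently and keep only the pages with keyword hits."""
    limiter = asyncio.Semaphore(max_concurrency)
    results = await asyncio.gather(*(scan_one(u, fetch, limiter) for u in urls))
    return {url: hits for url, hits in results if hits}
```

Calling asyncio.run(crawl(list_of_urls)) returns a dict of pages that matched. The semaphore is the important design choice: it lets many requests overlap while keeping a hard ceiling on how much load you push through your Tor client at once.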


4 Brutal Real-World Use Cases

1. Brand Exposure Monitoring for Financial Services

A Philippine bank discovers their customer database being traded on a Russian-language dark web forum. The crawler's keyword monitoring hits on the bank's internal project codename. Alert fires within 4 hours of posting. Without this platform? Discovery typically takes months, usually via a third-party breach notification or regulatory inquiry.

2. Ransomware Group Pre-Attack Surveillance

Your SOC identifies a ransomware group increasingly targeting Indonesian manufacturing. Through the ransomware.live PRO integration, you extract their YARA rules and IOCs, push them to your EDR, and harden exposed RDP endpoints the group favors. You prevent the attack before the phishing email lands.

3. Telegram-Based Threat Actor Tracking

A threat actor claims responsibility for a DDoS attack on your Singapore datacenter. The Channel Monitor scrapes their Telegram channel, auto-translates claims of future targets, and correlates with your infrastructure recon data showing exposed services. You patch before they escalate.

4. M&A Due Diligence Intelligence

Your firm is acquiring a Vietnamese logistics company. Before signing, infrastructure recon reveals weak email security (SPF/DMARC failures), historical subdomain takeovers, and dark web mentions of previous breaches that the seller never disclosed. You renegotiate or demand remediation.


Step-by-Step Installation & Setup Guide

The deployment philosophy is radical simplicity: one command, with no dependencies to install by hand.

Prerequisites

  • Fresh Linux server (Ubuntu 22.04/24.04 recommended)
  • 2GB RAM minimum (Chromium for PDF map rendering needs headroom)
  • Ports 80 and 443 open
  • Domain name (optional but strongly recommended for trusted SSL)

One-Command Deploy

# Basic deployment with self-signed SSL
curl -fsSL https://raw.githubusercontent.com/osintph/darkweb-scanner/main/deploy.sh -o /tmp/deploy.sh && sudo bash /tmp/deploy.sh

This single script performs everything: Docker installation, repository cloning, Tor configuration, secret generation, Nginx setup with SSL, and service orchestration. No manual dependency resolution. No configuration file archaeology.

Production deployment with real domain and Let's Encrypt:

# Pass the variables to the script itself: a VAR=value prefix on curl would not reach "sudo bash"
curl -fsSL https://raw.githubusercontent.com/osintph/darkweb-scanner/main/deploy.sh -o /tmp/deploy.sh && \
  sudo DOMAIN=scanner.yourdomain.com SSL_EMAIL=you@example.com bash /tmp/deploy.sh

Critical First Steps

After deployment completes:

  1. Create admin account at https://YOUR_SERVER_IP/register

    Security note: Registration closes automatically after the first account. No accidental open registration.

  2. Configure monitoring targets:

    # Edit keyword rules for dark web monitoring
    nano ~/darkweb-scanner/config/keywords.yaml
    
    # Add .onion seed URLs for crawler initialization
    nano ~/darkweb-scanner/config/seeds.txt
    
    # Set API keys and secrets
    nano ~/darkweb-scanner/.env
    
  3. Apply configuration changes:

    cd ~/darkweb-scanner && docker compose restart dashboard
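
Before restarting, it can help to sanity-check the seed file, since a malformed line is easy to miss. The sketch below is a hypothetical linter, not part of the platform. It assumes seeds.txt holds one URL per line and that lines starting with # are comments. The only hard fact it relies on is that current (v3) onion addresses are 56 base32 characters followed by .onion.

```python
import re
from urllib.parse import urlparse

# v3 onion hostnames: 56 characters from the base32 alphabet, then ".onion"
V3_ONION = re.compile(r"^[a-z2-7]{56}\.onion$")


def check_seed(line):
    """Return None if the line looks like a usable seed, else a reason string."""
    url = line.strip()
    if not url or url.startswith("#"):
        return None  # blank lines and comments are skipped (assumed convention)
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https"):
        return "missing http:// or https:// scheme"
    host = (parsed.hostname or "").lower()
    # Compare only the last two labels so www.<address>.onion also passes.
    base = ".".join(host.split(".")[-2:])
    if not V3_ONION.match(base):
        return "not a 56-character v3 .onion address"
    return None


def lint_seeds(text):
    """Return a list of (line number, reason) for every suspicious line."""
    problems = []
    for number, line in enumerate(text.splitlines(), start=1):
        reason = check_seed(line)
        if reason:
            problems.append((number, reason))
    return problems
```

Feed it the file contents, for example lint_seeds(open("config/seeds.txt").read()), and fix whatever it reports before the next crawl.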
    

Web Check Setup (Manual)

The Web Check OSINT module requires separate installation:

# Clone the web-check component
cd /root
git clone https://github.com/lissy93/web-check.git
# Export BASE_URL so it applies to both the install and the build step
cd web-check && export BASE_URL=/ && yarn install && yarn build

# Integrate with main platform
cd /root/darkweb-scanner && docker compose up -d webcheck

Important: Update the hardcoded webcheck.osintph.info URL in src/darkweb_scanner/dashboard/templates/index.html to your own subdomain.

Updating

cd ~/darkweb-scanner
git pull                    # Pull latest code
docker compose build --no-cache  # Rebuild containers
docker compose up -d        # Restart services

REAL Code Examples From the Repository

These aren't theoretical examples. These are actual implementations extracted from the osintph/darkweb-scanner codebase, demonstrating how the platform operates under the hood.

Example 1: One-Command Deployment Script

The entire deployment is orchestrated through this download-then-execute pattern:

# Download and execute the deployment orchestrator
curl -fsSL https://raw.githubusercontent.com/osintph/darkweb-scanner/main/deploy.sh -o /tmp/deploy.sh && sudo bash /tmp/deploy.sh

What's happening here: The -fsSL flags make curl fail on HTTP errors (-f), stay quiet (-s) while still printing errors (-S), and follow redirects (-L). The script is saved to /tmp (typically cleared on reboot) and then executed with sudo privileges; because it lands on disk first, you can read it before running it. Inside deploy.sh: Docker CE installation via the official repository, Tor daemon configuration with control port authentication, Nginx reverse proxy setup with automatic Let's Encrypt certificate provisioning via certbot, secret generation for Flask sessions and database connections, and finally docker compose up -d to start the full stack. Because the script provisions the entire environment itself, it sidesteps the "works on my machine" problem.

Example 2: Telegram Channel Monitor Authentication

First-time Telegram setup requires interactive authentication. The platform embeds this Python script for execution inside the container:

import asyncio
from telethon import TelegramClient
import os
from dotenv import load_dotenv

# Load environment from mounted .env file
load_dotenv('/app/.env')

async def auth():
    # Initialize client with session persistence in mounted volume
    c = TelegramClient(
        '/app/data/channel_monitor/channel_monitor',  # Session path persists across restarts
        int(os.environ['TELEGRAM_API_ID']),            # Your API ID from my.telegram.org
        os.environ['TELEGRAM_API_HASH']                # Your API hash from my.telegram.org
    )
    # Start with phone authentication; prompts for OTP interactively
    await c.start(phone=os.environ['TELEGRAM_PHONE'])
    print('Auth OK:', (await c.get_me()).username)
    await c.disconnect()

# Run the async auth flow
asyncio.run(auth())

Critical implementation detail: The session file at /app/data/channel_monitor/channel_monitor is stored in a Docker volume mount. This means you authenticate once, and the session persists across container restarts, rebuilds, and updates. The Telethon library speaks the MTProto protocol directly, avoiding the Bot API's limitations: this is a full user client capable of reading any public channel, not just bot-accessible content.
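
Once that session exists, later runs can reuse it without prompting. The following is a rough sketch of what that reuse could look like, not code from the repository. It assumes the same session path and environment variables as the auth script; fetch_recent and to_record are hypothetical helper names.

```python
import os


def to_record(msg_id, date_iso, text):
    """Flatten one message into a plain dict that is easy to export later."""
    return {"id": msg_id, "date": date_iso, "text": (text or "").strip()}


async def fetch_recent(channel, limit=20):
    """Read the newest messages from a public channel using the saved session."""
    # Imported lazily so to_record() stays usable without Telethon installed.
    from telethon import TelegramClient

    client = TelegramClient(
        "/app/data/channel_monitor/channel_monitor",  # same session file as the auth step
        int(os.environ["TELEGRAM_API_ID"]),
        os.environ["TELEGRAM_API_HASH"],
    )
    await client.connect()
    try:
        # connect() never prompts, so check that the stored session is still valid
        if not await client.is_user_authorized():
            raise RuntimeError("No valid session; run the interactive auth script first")
        records = []
        async for msg in client.iter_messages(channel, limit=limit):
            records.append(to_record(msg.id, msg.date.isoformat(), msg.text))
        return records
    finally:
        await client.disconnect()
```

Inside the container you would run it with asyncio.run(fetch_recent("some_public_channel")). Using connect() plus is_user_authorized() instead of start() means a scheduled job fails loudly rather than hanging on an OTP prompt nobody will answer.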

Example 3: Environment Configuration for Intelligence Feeds

The .env file structure demonstrates the platform's feed architecture:

# Core platform security: CHANGE THESE
DASHBOARD_SECRET_KEY=your-random-secret-here
TOR_CONTROL_PASSWORD=auto-generated-by-deploy-script
DATABASE_URL=sqlite:///app/data/darkweb_scanner.db  # Or PostgreSQL for production

# ransomware.live PRO: free tier unlocks enterprise data
RANSOMWARE_LIVE_API_KEY=your-key-from-my.ransomware.live

# IOC correlation feeds: free tiers sufficient for most deployments
THREATFOX_API_KEY=your-threatfox-key
WHITEINTEL_API_KEY=your-whiteintel-key
OTX_API_KEY=your-alienvault-otx-key

# IP reputation and enrichment
ABUSEIPDB_API_KEY=your-abuseipdb-key        # 1,000 checks/day free
VIRUSTOTAL_API_KEY=your-virustotal-key      # 4 requests/min free

# DNS reconnaissance enrichment
DNSDUMPSTER_API_KEY=your-dnsdumpster-key

# Daily digest delivery
MAILGUN_API_KEY=your-mailgun-key
MAILGUN_DOMAIN=mg.yourdomain.com
MAILGUN_FROM=intel@yourdomain.com

# Telegram integration: required for Channel Monitor
TELEGRAM_API_ID=12345678
TELEGRAM_API_HASH=abcdef1234567890abcdef1234567890
TELEGRAM_PHONE=+639XXXXXXXXX
TELEGRAM_CHANNELS=channel1,channel2,channel3  # Background monitoring targets

Architecture insight: Every feed is optional. The platform degrades gracefully: if you lack VirusTotal keys, IP investigation falls back to AbuseIPDB. No single feed is a point of failure. The separation of background scraper channels (TELEGRAM_CHANNELS) from on-demand Channel Monitor usage allows flexible operational security: you might monitor public channels automatically while reserving interactive scraping for incident response.
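
The fallback idea is simple enough to sketch in a few lines. This is an illustration of the pattern only, assuming a preference order of VirusTotal first and AbuseIPDB second; the function names are hypothetical, and the environment variable names are taken from the .env example.

```python
import os

# Hypothetical preference order: first provider with a configured key wins.
IP_PROVIDERS = [
    ("virustotal", "VIRUSTOTAL_API_KEY"),
    ("abuseipdb", "ABUSEIPDB_API_KEY"),
]


def available_providers(env=None):
    """List providers whose API key is present and non-empty, in preference order."""
    env = os.environ if env is None else env
    return [name for name, key in IP_PROVIDERS if env.get(key, "").strip()]


def pick_ip_provider(env=None):
    """Return the preferred usable provider, or None if no key is configured."""
    providers = available_providers(env)
    return providers[0] if providers else None
```

Passing a plain dict instead of os.environ makes the selection logic easy to test, and returning None (rather than raising) lets the caller decide whether a missing feed should hide a dashboard panel or just log a warning.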

Example 4: Operational Makefile Commands

Daily operations are streamlined through Make targets:

# Execute immediate dark web crawl (foreground, verbose output)
make scan

# Verify Tor circuit connectivity and exit node rotation
make check-tor

# Display aggregate statistics: URLs crawled, keywords matched, alerts generated
make stats

# Review recent keyword hits with context snippets
make hits

# Stream all container logs with follow (like tail -f for the full stack)
make logs

# Graceful shutdown of all platform services
make stop

Operational pattern: These commands abstract Docker Compose complexity. make scan likely executes docker compose exec dashboard python -m darkweb_scanner.crawler with proper environment injection. The check-tor target is crucial: if Tor has not finished bootstrapping or its circuits to the target .onion services keep failing, crawls can fail silently. This diagnostic prevents wasted cycles.


Advanced Usage & Best Practices

🔒 Security Hardening

  • Never expose the registration endpoint after initial setup. The platform auto-closes it, but verify: grep -r "register" ~/darkweb-scanner/src/darkweb_scanner/dashboard/auth_routes.py
  • Use PostgreSQL for multi-user deployments. SQLite locks under concurrent writes. Set DATABASE_URL=postgresql://user:pass@db/darkweb_scanner.
  • Rotate Tor circuits before sensitive crawls: the control port accepts the SIGNAL NEWNYM command, which asks Tor to use fresh circuits for new connections.
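
For the circuit rotation tip, here is a minimal sketch of talking to Tor's control port directly with nothing but the standard library. The host and port are assumptions (9051 is Tor's conventional control port; inside the Compose network the host will differ), and the password is whatever TOR_CONTROL_PASSWORD holds in your .env. The function names are hypothetical.

```python
import socket


def quote(value):
    """Wrap a string in double quotes, escaping backslashes and quotes."""
    return '"' + value.replace("\\", "\\\\").replace('"', '\\"') + '"'


def newnym_commands(password):
    """Control-port commands needed to authenticate and request new circuits."""
    return [f"AUTHENTICATE {quote(password)}", "SIGNAL NEWNYM", "QUIT"]


def request_new_identity(password, host="127.0.0.1", port=9051, timeout=10):
    """Send the commands and raise if Tor answers with anything but a 250 reply."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        reader = sock.makefile("r", encoding="utf-8", newline="\r\n")
        for command in newnym_commands(password):
            sock.sendall((command + "\r\n").encode("utf-8"))
            reply = reader.readline().strip()
            if not reply.startswith("250"):
                raise RuntimeError(f"{command.split()[0]} failed: {reply}")
```

Keep in mind that Tor rate-limits NEWNYM, so calling this in a tight loop will not give you a new circuit on every request. Trigger it once before a sensitive crawl and let it settle for a few seconds.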

⚡ Performance Optimization

  • The 2GB RAM minimum is real. Chromium headless for PDF map generation can consume 800MB+. Monitor with docker stats.
  • Seed file curation matters. Quality .onion seeds (verified link lists, not dead onions) dramatically improve crawl coverage. The platform doesn't magically find hidden services; it needs starting points.
  • Keyword YAML structure supports regex patterns and severity weighting. Prioritize high-confidence, low-noise patterns over broad matching.
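
To show why severity weighting matters, here is a small sketch of regex rules with a noise floor. The rule names, patterns, and the shape of the RULES list are invented for illustration; check the repository's keywords.yaml for the real schema before copying anything.

```python
import re

# Hypothetical rules, shaped like what a parsed keywords.yaml might contain.
RULES = [
    {"name": "internal codename", "pattern": r"\bproject[-_ ]halcyon\b", "severity": 9},
    {"name": "brand mention", "pattern": r"\bacme\s*bank\b", "severity": 5},
    {"name": "generic leak term", "pattern": r"\bdatabase\b", "severity": 1},
]


def compile_rules(rules):
    """Pre-compile each pattern once so scanning many pages stays cheap."""
    return [(r["name"], re.compile(r["pattern"], re.IGNORECASE), r["severity"]) for r in rules]


def score_page(text, compiled, min_severity=3):
    """Return (total score, matched rule names), ignoring rules below min_severity."""
    matched = [
        (name, severity)
        for name, pattern, severity in compiled
        if severity >= min_severity and pattern.search(text)
    ]
    return sum(severity for _, severity in matched), [name for name, _ in matched]
```

With these rules, a page mentioning only "database" scores zero and stays out of your alert queue, while a page that names the internal codename next to the brand scores 14. That is the "high-confidence, low-noise" tradeoff in practice.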

🎯 Intelligence Tuning

  • Regional focus: The SEA/PH threat actor profiles are community-curated. Contribute back; this is AGPL software.
  • False positive suppression: Corroborate dark web hits with Telegram scraper mentions and IOC feed matches before escalating.
  • Digest timing: Schedule Mailgun delivery for your SOC shift handover, not midnight when nobody reads it.
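
The false positive suppression tip boils down to a corroboration rule, which can be sketched as follows. This is a hypothetical helper, not platform code: the source labels ("darkweb", "telegram", "ioc_feed") and the two-source threshold are assumptions you would tune for your own SOC.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Sighting:
    source: str      # illustrative labels: "darkweb", "telegram", "ioc_feed"
    indicator: str   # a domain, IP address, brand keyword, and so on


def should_escalate(indicator, sightings, min_sources=2):
    """Escalate a dark web hit only when another independent source agrees."""
    wanted = indicator.lower()
    sources = {s.source for s in sightings if s.indicator.lower() == wanted}
    return "darkweb" in sources and len(sources) >= min_sources
```

Counting distinct sources rather than raw sightings is deliberate: ten crawler hits on the same forum thread are still one source, and should not outvote the absence of any other signal.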

Comparison with Alternatives

| Capability | osintph/darkweb-scanner | Recorded Future | Mandiant Advantage | OpenCTI (Open Source) |
|---|---|---|---|---|
| Cost | Free (AGPL) | $50K-$500K/year | $100K+ | Free (Apache 2.0) |
| Self-hosted | ✅ Full control | ❌ SaaS only | ❌ SaaS only | ✅ Yes |
| Dark web crawling | ✅ Native Tor async | ✅ Yes | ✅ Yes | ❌ Requires plugins |
| Telegram monitoring | ✅ Native + on-demand | ⚠️ Limited | ⚠️ Limited | ❌ Not native |
| Ransomware tracking | ✅ 324+ groups via PRO API | ✅ Yes | ✅ Yes | ⚠️ Manual feeds |
| SEA regional focus | ✅ Built-in | ❌ Minimal | ❌ Minimal | ❌ None |
| Infrastructure recon | ✅ DNS + port scan + dir enum | ⚠️ Separate product | ⚠️ Separate product | ❌ Not included |
| Deployment complexity | ✅ One command | ❌ Sales process | ❌ Sales process | ⚠️ Complex setup |
| Data sovereignty | ✅ You own everything | ❌ US-hosted | ❌ US-hosted | ✅ Yes |

Verdict: Commercial platforms offer broader global threat actor coverage. But for Southeast Asian organizations prioritizing regional context, data sovereignty, and cost efficiency, osintph/darkweb-scanner delivers roughly 80% of enterprise capability at 0% of the license cost, and it adds the regional intelligence that those competitors lack.


FAQ

Q: Is crawling .onion sites legal? A: Generally, yes. Accessing publicly available content on Tor is legal in most jurisdictions. The crawler performs passive observation, not unauthorized access. Always consult legal counsel for your specific regulatory environment and ensure compliance with local cybercrime statutes.

Q: Can I use this without a domain name? A: Yes, but not recommended. The deploy script generates a self-signed certificate accessible via IP address. Browsers will show security warnings, and some Telegram API features may behave unexpectedly without proper HTTPS.

Q: How does this compare to running Tor Browser manually? A: Night and day. Manual browsing is non-repeatable, non-scalable, and leaves no audit trail. This platform provides automated scheduling, keyword alerting, historical correlation, team collaboration, and structured data export, all while maintaining operational security through containerized isolation.

Q: What happens when ransomware.live PRO stops being free? A: The platform falls back to local ransomware data (ransomware_data.py) and community contributions. The architecture is designed for graceful degradation. Additionally, AGPL licensing means the community can fork and adapt if API terms change.

Q: Is 2GB RAM really enough for production? A: For small teams (1-5 users), yes. Scale vertically for larger deployments or disable PDF map generation (the Chromium memory hog) by modifying the recon module configuration. PostgreSQL externalization also reduces memory pressure.

Q: Can I monitor private Telegram channels? A: No. The platform respects Telegram's access controls. Only public channels are accessible. For private channel monitoring, you would need legitimate membership and explicit authorization, which the platform does not circumvent.

Q: How do I contribute threat intelligence back to the community? A: Submit pull requests for keyword lists, SEA-specific threat actor profiles, or .onion seed discoveries. The AGPL license ensures all improvements remain open. Contact the maintainers through GitHub Issues or OSINT PH.


Conclusion

The threat intelligence market has sold us a lie: that effective dark web monitoring requires six-figure budgets, proprietary black boxes, and Silicon Valley infrastructure. osintph/darkweb-scanner exposes that lie with a single, devastating command.

In five minutes, you can deploy what took enterprise security teams years to assemble: Tor-based dark web crawling, ransomware group tracking across 324+ actors, Telegram criminal coordination monitoring, automated infrastructure reconnaissance, and a daily intelligence digest, all self-hosted, auditable, and free forever under AGPL v3.

For Philippine and Southeast Asian security practitioners, this isn't just a tool. It's sovereignty over your threat landscape. No more flying blind with Western-centric feeds that miss regional threat actors. No more budget approvals for intelligence you can't inspect. No more trusting opaque SaaS providers with your most sensitive security data.

The dark web doesn't sleep. Your competitors for that data certainly don't. But now, neither do you.

👉 Deploy your instance today: github.com/osintph/darkweb-scanner

Star the repository. Contribute regional intelligence. Join the community. The next threat actor targeting your organization is already coordinating somewhere. Make sure you're watching when they do.


Last updated: 2024. Platform version 1.0.1. For deployment issues and feature requests, use the official GitHub Issues.
