Web Information Gathering
Overview
Web Application Information Gathering is a specialized phase of reconnaissance that focuses on web applications and their underlying technologies. Unlike infrastructure enumeration, this phase targets the application layer to identify technologies, frameworks, hidden files, parameters, and potential attack vectors.
This guide is organized into two comprehensive sections:
Table of Contents
Subdomain Enumeration
Comprehensive guide to discovering subdomains and DNS infrastructure
Topics Covered:
Manual DNS Enumeration - dig, zone transfers, advanced techniques
Automated DNS Tools - dnsenum, fierce, dnsrecon with HTB examples
Advanced Subdomain Discovery - amass, puredns for high-performance enumeration
Passive Discovery - Certificate transparency, subfinder, assetfinder
Tool Selection Guide - When to use what tool and performance comparison
Security Considerations - Rate limiting, stealth techniques, defensive measures
Key Tools:
dig - Manual DNS queries and analysis
dnsenum - Comprehensive DNS enumeration with zone transfers
amass - Advanced subdomain discovery with 30+ data sources
puredns - High-performance DNS brute-forcing with wildcard filtering
subfinder - Passive subdomain enumeration
fierce - User-friendly subdomain scanner
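Because several of these tools report overlapping results, it usually helps to merge their output before moving on. The following is a minimal sketch, not part of any listed tool: `merge_subdomains` is a hypothetical helper name, and the file names in the usage comment are assumptions. It lowercases each hostname, strips wildcard prefixes and trailing dots, keeps only names that belong to the target domain, and removes duplicates.

```shell
# merge_subdomains DOMAIN [FILE...]
# Hypothetical helper: reads candidate hostnames (one per line) from the
# given files, or from stdin when no files are given, and prints the
# unique names that are DOMAIN itself or a subdomain of it.
merge_subdomains() {
    domain=$1
    shift
    cat -- "$@" \
        | tr '[:upper:]' '[:lower:]' \
        | sed -e 's/^[[:space:]]*//' -e 's/[[:space:]]*$//' \
              -e 's/^\*\.//' -e 's/\.$//' \
        | awk -v d="$domain" '
            # Keep the apex domain, or any name ending in ".DOMAIN".
            $0 == d { print; next }
            length($0) > length(d) &&
                substr($0, length($0) - length(d)) == ("." d) { print }
          ' \
        | LC_ALL=C sort -u
}

# Example usage (output file names are assumptions):
#   subfinder -d example.com -silent > subfinder.txt
#   assetfinder --subs-only example.com > assetfinder.txt
#   merge_subdomains example.com subfinder.txt assetfinder.txt > all-subdomains.txt
```

Matching on the `.example.com` suffix rather than a plain substring avoids keeping look-alike hosts such as `evil-example.com`.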
Web Application Enumeration
Detailed guide to enumerating web applications and their components
Topics Covered:
Technology Stack Identification - whatweb, Wappalyzer, header analysis
Directory & File Enumeration - gobuster, ffuf, dirb for hidden content
Virtual Host Discovery - Finding additional applications on same server
Parameter Discovery - ffuf, arjun, paramspider for hidden parameters
API Enumeration - REST, GraphQL, OpenAPI documentation discovery
Web Crawling & Spidering - ReconSpider, hakrawler, Burp Suite, OWASP ZAP
Search Engine Discovery - Google dorking, OSINT techniques, automated tools
Web Archives - Wayback Machine, historical analysis, waybackurls, gau
Automated Frameworks - FinalRecon, Recon-ng, theHarvester, SpiderFoot
JavaScript Analysis - LinkFinder, endpoint extraction, sensitive data
CMS-Specific Enumeration - WordPress, Joomla, Drupal specialized tools
Security Analysis - Headers, SSL/TLS, WAF detection and bypass
Key Tools:
whatweb - Technology stack identification
gobuster - Directory and file discovery
ffuf - Fast web fuzzing for parameters and vhosts
wpscan - WordPress security scanner
arjun - Parameter discovery tool
wafw00f - WAF detection and fingerprinting
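Header analysis, listed under Technology Stack Identification, can be done with nothing more than `curl` and `grep`. The sketch below assumes a hypothetical helper named `tech_headers`. It reads raw response headers on stdin and keeps only a handful of headers that commonly disclose server software or frameworks. The header list is an illustrative assumption and is not exhaustive.

```shell
# tech_headers
# Hypothetical helper: filters raw HTTP response headers from stdin and
# prints only the ones that often reveal the technology stack.
tech_headers() {
    # Strip carriage returns so each header ends in a plain newline,
    # then match header names case-insensitively.
    tr -d '\r' \
        | grep -iE '^(server|x-powered-by|x-generator|x-aspnet-version|via):'
}

# Example usage:
#   curl -sI https://example.com | tech_headers
```

An empty result does not mean the stack is hidden; many servers simply omit these headers, in which case tools such as whatweb fall back to other fingerprints.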
Quick Start Guide
Phase 1: Subdomain Discovery
# Quick subdomain enumeration
subfinder -d example.com
assetfinder example.com
# Comprehensive active enumeration
amass enum -active -d example.com -brute
Phase 2: Technology Identification
# Identify web technologies
whatweb https://example.com
curl -I https://example.com
Phase 3: Content Discovery
# Directory enumeration
gobuster dir -u https://example.com -w /usr/share/wordlists/dirb/common.txt
# Parameter discovery
# The URL is quoted so the shell does not treat "?" as a glob character
ffuf -u 'https://example.com/page?FUZZ=value' -w parameters.txt
Phase 4: Search Engine Discovery
# Google dorking reconnaissance
site:example.com
site:example.com inurl:login
site:example.com filetype:pdf
site:example.com "confidential" OR "internal"
Phase 5: Web Archives Analysis
# Historical website analysis
echo "example.com" | waybackurls
gau example.com
# Manual Wayback Machine investigation
https://web.archive.org/web/*/example.com
Phase 6: Automated Reconnaissance
# FinalRecon comprehensive scan
./finalrecon.py --full --url http://example.com
# theHarvester OSINT gathering
theHarvester -d example.com -l 500 -b all
Essential Tools Summary
| Category | Tool | Purpose | Best for |
| --- | --- | --- | --- |
| DNS | dig | Manual DNS queries | Zone transfers, detailed analysis |
| DNS | dnsenum | Automated DNS enumeration | Comprehensive reconnaissance |
| DNS | amass | Advanced subdomain discovery | Maximum coverage with 30+ sources |
| DNS | puredns | High-performance brute-forcing | Massive wordlist handling |
| Web | whatweb | Technology detection | Initial reconnaissance |
| Web | nikto | Web server scanning | Comprehensive security assessment |
| Web | builtwith | Technology profiling | Detailed technology stack analysis |
| Web | netcraft | Web security services | Security posture assessment |
| Web | gobuster | Directory discovery | Finding hidden content |
| Web | ffuf | Web fuzzing | Parameter/vhost discovery |
| Web | reconspider | Custom web crawling | HTB Academy reconnaissance |
| Web | hakrawler | Web crawling | Content discovery |
| Web | google dorking | OSINT reconnaissance | Search engine discovery |
| Web | wayback machine | Web archives | Historical website analysis |
| Web | waybackurls | Archive URL extraction | Historical endpoint discovery |
| Web | finalrecon | Automated framework | All-in-one Python reconnaissance |
| Web | recon-ng | Modular framework | Database-driven reconnaissance |
| Web | theharvester | OSINT gathering | Email, subdomain, employee discovery |
| Web | wpscan | WordPress security | CMS-specific testing |
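Before starting an engagement it can be useful to confirm which of the command-line tools from this summary are actually installed. This is a minimal sketch under the assumption that each tool is invoked by the binary name shown in the usage comment; `missing_tools` is a hypothetical helper, and the names may differ depending on how a tool was installed.

```shell
# missing_tools NAME...
# Hypothetical helper: prints every NAME that cannot be found on PATH.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# Example usage (binary names are assumptions):
#   missing_tools dig dnsenum amass puredns whatweb nikto gobuster ffuf \
#       hakrawler waybackurls recon-ng theHarvester wpscan
```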
HTB Academy Integration
Both guides include practical HTB Academy lab examples with:
Real-world reconnaissance scenarios
Command-line examples with expected outputs
Step-by-step methodology for CPTS exam preparation
Analysis of results and next steps
Security Considerations
Rate Limiting & Stealth
Use passive enumeration when possible
Implement delays between requests
Distribute queries across multiple DNS servers
Monitor for detection and blocking
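One simple way to implement delays between requests is to wrap each probe in a loop that sleeps between iterations. The sketch below assumes a hypothetical helper named `throttle`; the two-second delay and the input file name in the usage comment are illustrative, not recommended values.

```shell
# throttle DELAY CMD [ARGS...]
# Hypothetical helper: reads one item per line from stdin and runs
# "CMD ARGS ITEM" for each, sleeping DELAY seconds between runs.
throttle() {
    delay=$1
    shift
    first=1
    while IFS= read -r item; do
        # Skip blank lines in the input.
        [ -n "$item" ] || continue
        # Sleep before every run except the first one.
        if [ "$first" -eq 0 ]; then
            sleep "$delay"
        fi
        first=0
        # Redirect stdin so the command cannot consume the item list.
        "$@" "$item" </dev/null
    done
}

# Example usage: resolve each host with a 2-second pause between queries.
#   throttle 2 dig +short < all-subdomains.txt
```

A fixed delay is the simplest option; adding random jitter or honoring a target's documented rate limits would be a natural refinement.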
Legal & Ethical
Obtain proper authorization before testing
Respect rate limits and server resources
Follow responsible disclosure practices
Document all reconnaissance activities
Learning Path
Start with Subdomain Enumeration to understand DNS infrastructure
Progress to Web Application Enumeration for application-level discovery
Practice with HTB Academy labs for hands-on experience
Combine techniques for comprehensive reconnaissance methodology
WHOIS Information Gathering
Basic WHOIS Lookup
# Basic WHOIS query
whois example.com
# Extract key information
whois example.com | grep -E "(Registrar|Creation Date|Registry Expiry|Updated Date)"
# Name servers
whois example.com | grep -i "name server"
# Contact information
whois example.com | grep -E "(Registrant|Admin|Tech)" -A 5
Intelligence Extraction
# Extract email addresses
whois example.com | grep -oE '[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}'
# Check domain age
whois example.com | grep -i "creation date"
# Privacy protection detection
whois example.com | grep -iE "(whoisguard|privacy|proxy|domains by proxy)"
Key Information to Extract:
Domain registration details and timeline
Registrant contact information
Name server configuration
Domain age and transfer history
Privacy protection status
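The individual `grep` commands above can be combined into a single pass that produces a compact summary. This is a sketch only: `whois_summary` is a hypothetical helper, and it assumes the `Key: value` field labels used in the queries above ("Registrar", "Creation Date", "Registry Expiry Date", "Name Server"). WHOIS output is not standardized, so other registries may use different labels and produce no matches.

```shell
# whois_summary
# Hypothetical helper: reads WHOIS output on stdin and prints selected
# fields as key=value lines, with duplicates removed.
whois_summary() {
    tr -d '\r' | awk '
        {
            line = $0
            sub(/^[ \t]+/, "", line)
            # Split on the first colon only, so timestamps stay intact.
            pos = index(line, ":")
            if (pos == 0) next
            key = tolower(substr(line, 1, pos - 1))
            val = substr(line, pos + 1)
            sub(/^[ \t]+/, "", val)
            if (key == "registrar")                 print "registrar=" val
            else if (key == "creation date")        print "created=" val
            else if (key == "registry expiry date") print "expires=" val
            else if (key == "name server")          print "ns=" tolower(val)
        }
    ' | LC_ALL=C sort -u
}

# Example usage:
#   whois example.com | whois_summary
```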
References
HTB Academy: Information Gathering - Web Edition
OWASP Web Security Testing Guide
RFC 1034: Domain Names - Concepts and Facilities
RFC 1035: Domain Names - Implementation and Specification
SecLists: https://github.com/danielmiessler/SecLists
Burp Suite Documentation
FFUF Documentation: https://github.com/ffuf/ffuf
Amass Documentation: https://github.com/OWASP/Amass
Next Steps
After completing web information gathering:
Infrastructure Enumeration - Port scanning and service detection
Vulnerability Assessment - Identify specific security weaknesses
Exploitation Planning - Develop attack vectors based on findings
Reporting - Document discoveries and recommendations