How to Block Bots from Apache with WHM/cPanel
Automated bots crawling websites are a common challenge for website owners and administrators. While some bots, such as Googlebot or Bingbot, help index your site and drive traffic, many others cause harm. Malicious bots may scrape your content, attempt brute-force attacks, consume excessive bandwidth, or skew your analytics data.
If you manage your hosting through WHM/cPanel and use Apache as your web server, there are several effective ways to block unwanted bots and protect your site. This guide from Go4hosting will walk you through the process of identifying, blocking, and managing bots using WHM/cPanel and Apache configurations.
Why Block Bots?
Before diving into methods, it's important to understand why bot blocking matters:
Reduce server load: Bots can generate high traffic that consumes CPU, memory, and bandwidth.
Protect sensitive areas: Bots may try to brute-force login pages or scrape private content.
Improve analytics accuracy: Filtering bots ensures your visitor statistics reflect real users.
Prevent spam: Bots can post spam comments or forms if not properly controlled.
Understanding Bots: Good vs. Bad
Good bots: Search engines like Google, Bing, Yahoo, and social media crawlers.
Bad bots: Scrapers, spam bots, vulnerability scanners, brute-force attackers.
Good bots generally identify themselves with legitimate user agents and IP addresses. Bad bots may disguise themselves or come from suspicious IP ranges.
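Because a user agent string is trivial to forge, a request that claims to be Googlebot is not necessarily from Google. One common check is a reverse DNS lookup followed by a forward lookup. The sketch below assumes the `host` utility is installed; the function names `is_google_crawler_host` and `verify_googlebot` are hypothetical, and the accepted domains should be confirmed against Google's current crawler documentation.

```shell
# Assumption: genuine Googlebot hosts resolve to names under these domains.
is_google_crawler_host() {
    case "$1" in
        *.googlebot.com|*.google.com) return 0 ;;
        *) return 1 ;;
    esac
}

# Hypothetical helper: succeed only if the IP's reverse DNS name is a Google
# crawler host AND that name resolves back to the same IP.
verify_googlebot() {
    ip=$1
    # Reverse lookup: the last field of the `host` output is the PTR name,
    # printed with a trailing dot that is stripped here.
    name=$(host "$ip" 2>/dev/null \
        | awk '/domain name pointer/ { print $NF; exit }' \
        | sed 's/\.$//')
    is_google_crawler_host "$name" || return 1
    # Forward lookup: one of the returned addresses must equal the original IP.
    host "$name" 2>/dev/null \
        | awk -v ip="$ip" '$NF == ip { ok = 1 } END { exit !ok }'
}

# Usage, with an IP taken from your own access log:
# verify_googlebot "$ip" && echo "verified Google crawler"
```

The two-step lookup matters because anyone who controls the reverse DNS for their own IP range can make it point to a Google-looking name; only the forward lookup confirms the name really belongs to that address.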
Methods to Block Bots in Apache with WHM/cPanel
1. Using cPanel's IP Blocker
The simplest method to block bots is to block their IP addresses using cPanel:
Log into cPanel for your domain.
Navigate to Security > IP Blocker.
Enter the IP addresses or IP ranges you want to block (e.g., 203.0.113.5 or 203.0.113.0/24).
Click Add.
Limitations:
IP blocking is effective but manual and reactive.
Bots often use many IPs or change IP addresses.
Blocking large IP ranges can affect legitimate users.
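To gauge how broad a range is before blocking it, it helps to count the addresses it covers. The following is a minimal sketch for IPv4 CIDR notation only; `cidr_size` is a hypothetical helper, not a cPanel command.

```shell
# Hypothetical helper: count the IPv4 addresses covered by a CIDR range.
cidr_size() {
    prefix=${1#*/}                 # text after the slash, e.g. "24"
    if [ "$prefix" = "$1" ]; then  # no slash present: a single address (/32)
        prefix=32
    fi
    echo $(( 1 << (32 - prefix) ))
}

cidr_size 203.0.113.5      # prints 1
cidr_size 203.0.113.0/24   # prints 256
cidr_size 203.0.0.0/16     # prints 65536
```

A /16 already covers tens of thousands of addresses, so a block that wide is likely to catch real visitors along with the bot.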
2. Using .htaccess to Block Based on User-Agent
Many bots identify themselves in the User-Agent request header. You can block them by user agent in Apache's .htaccess file, keeping in mind that this header is easy to forge:
Connect to your site's root directory using FTP or File Manager in cPanel.
Edit or create a .htaccess file.
Add the following snippet:
RewriteEngine On
# Block bad bots by user agent
RewriteCond %{HTTP_USER_AGENT} (bot1|bot2|scraper|badbot) [NC]
RewriteRule .* - [F,L]
Replace bot1|bot2|scraper|badbot with actual bot names or keywords from user agents you want to block. Example:
RewriteCond %{HTTP_USER_AGENT} (AhrefsBot|MJ12bot|SemrushBot|DotBot) [NC]
This returns a 403 Forbidden response to matching bots.
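Before deploying a pattern, it is worth checking it against a few sample user agents so that it does not catch crawlers you want to keep. The sketch below approximates the case-insensitive `[NC]` match with `grep -Ei`; the helper names `ua_is_blocked` and `check_ua` are hypothetical, and the sample user agent strings are illustrative.

```shell
# Same alternation as in the RewriteCond example.
BAD_BOT_PATTERN='AhrefsBot|MJ12bot|SemrushBot|DotBot'

# Succeeds if the given User-Agent string matches the blocklist pattern.
ua_is_blocked() {
    printf '%s\n' "$1" | grep -Eiq "$BAD_BOT_PATTERN"
}

# Prints a one-line verdict for a User-Agent string.
check_ua() {
    if ua_is_blocked "$1"; then
        echo "BLOCKED: $1"
    else
        echo "allowed: $1"
    fi
}

check_ua 'Mozilla/5.0 (compatible; AhrefsBot/7.0; +http://ahrefs.com/robot/)'
check_ua 'Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)'
```

A broad keyword such as `bot` would also match Googlebot and Bingbot, which is why specific names are safer. Once the rule is live, a request such as `curl -I -A 'AhrefsBot' https://example.com/` (with your own domain in place of example.com) should come back with a 403 status.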
3. Blocking Bots via WHM's Apache Configuration Include Files
For server-wide blocking on a WHM-managed server, you can add custom Apache directives:
Log into WHM as root.
Navigate to Service Configuration > Apache Configuration > Include Editor.
Under Pre Main Include, select All Versions or your specific Apache version from the menu.
Add custom rules such as:
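RewriteEngine On
# Apply these rules to every virtual host (requires Apache 2.4.8 or later)
RewriteOptions InheritDown
RewriteCond %{HTTP_USER_AGENT} (AhrefsBot|MJ12bot|SemrushBot|DotBot) [NC]
RewriteRule .* - [F,L]
This approach blocks bots at the server level for all domains hosted on the server. The RewriteOptions InheritDown line matters because, by default, virtual hosts do not inherit mod_rewrite rules defined in the main server context.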
4. Using CSF Firewall to Block Bad Bots
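If your WHM server has ConfigServer Security & Firewall (CSF) installed, you can block bad bot IPs directly at the firewall:
Open the CSF deny list in an editor:
nano /etc/csf/csf.deny
Add the IP addresses or ranges you want to block, one per line, then save the file.
Restart CSF to apply the changes:
csf -r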
Blocking bots at the firewall level is highly effective as it prevents them from even reaching Apache.
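When the list of addresses comes from a script or a log review, appending entries by hand gets tedious. The sketch below adds addresses to a deny file in the `IP # comment` form, skipping any that are already listed. `add_deny_entries` is a hypothetical helper, not part of CSF, and the comment text it writes is only an illustration.

```shell
# Hypothetical helper: append IPs or CIDR ranges to a CSF-style deny file,
# skipping entries that are already present as the first field of a line.
add_deny_entries() {
    deny_file=$1
    shift
    for ip in "$@"; do
        if awk -v ip="$ip" '$1 == ip { found = 1 } END { exit !found }' \
               "$deny_file" 2>/dev/null; then
            echo "skip (already listed): $ip"
        else
            printf '%s # bad bot, added %s\n' "$ip" "$(date +%Y-%m-%d)" >> "$deny_file"
            echo "added: $ip"
        fi
    done
}

# Usage on a live server (run as root), followed by a firewall restart:
# add_deny_entries /etc/csf/csf.deny 203.0.113.5 198.51.100.0/24
# csf -r
```

For a single address, CSF's own `csf -d <ip> <comment>` command does the same job and applies the block immediately.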
5. Leveraging ModSecurity Rules to Block Bots
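WHM/cPanel often includes ModSecurity, a web application firewall (WAF), which lets you write custom rules that inspect and reject requests.
Example ModSecurity rule blocking user agents (the (?i) flag makes the match case-insensitive, and the id must be unique among your rules):
SecRule REQUEST_HEADERS:User-Agent "@rx (?i)(AhrefsBot|MJ12bot|SemrushBot|DotBot)" "id:123456,phase:1,deny,status:403,log,msg:'Blocked bad bot user agent'"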
Rules like this give you granular control over requests.
6. Blocking via robots.txt (Less Effective)
You can disallow bots via robots.txt by specifying:
User-agent: BadBot
Disallow: /
However, malicious bots often ignore robots.txt. This method is better suited for well-behaved bots.
How to Identify Bad Bots
Check Access Logs
Review your Apache access logs (usually at /etc/apache2/logs/access_log or /usr/local/apache/logs/access_log, with per-domain logs typically under /etc/apache2/logs/domlogs/) to spot suspicious user agents or IPs.
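Example command to list the most frequent user agents (assuming the combined log format, where the user agent is the sixth field when a line is split on double quotes):
awk -F'"' '{print $6}' access_log | sort | uniq -c | sort -nr | head -20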
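Once a suspicious user agent stands out, the next question is usually which IP addresses are sending it. The sketch below wraps both steps in two small functions. It assumes the combined log format; `top_user_agents` and `ips_for_agent` are hypothetical helper names.

```shell
# Print the N most frequent user agents in a log file (default 20).
top_user_agents() {
    # $1 = log file, $2 = number of entries to show
    awk -F'"' '{ print $6 }' "$1" | sort | uniq -c | sort -nr | head -n "${2:-20}"
}

# Print the client IPs that sent a given user-agent keyword, busiest first.
ips_for_agent() {
    # $1 = log file, $2 = case-insensitive keyword such as "AhrefsBot"
    awk -F'"' -v kw="$2" '
        index(tolower($6), tolower(kw)) {
            split($1, f, " ")   # the client IP is the first word of the line
            print f[1]
        }' "$1" | sort | uniq -c | sort -nr
}

# Usage:
# top_user_agents /etc/apache2/logs/access_log 10
# ips_for_agent /etc/apache2/logs/access_log AhrefsBot
```

The resulting IP list can feed the IP Blocker or firewall methods described earlier, while the user agent list feeds the .htaccess and ModSecurity rules.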
Use Tools
Best Practices for Bot Management
Whitelist good bots like Googlebot to avoid blocking search engine indexing.
Regularly update your blocked bot list as new bots emerge.
Use a multi-layered approach: firewall + Apache rules + ModSecurity.
Monitor logs for false positives to avoid blocking legitimate users.
Educate your team about the difference between good and bad bots.
Summary
| Method | Use Case | Level | Ease of Use | Effectiveness |
|---|---|---|---|---|
| cPanel IP Blocker | Simple IP block | Domain level | Easy | Moderate |
| .htaccess User-Agent Blocking | Block by user agent | Domain level | Moderate | Moderate |
| WHM Apache Include Editor | Server-wide user-agent block | Server level | Advanced | High |
| CSF Firewall | IP-based blocking | Server level | Advanced | Very High |
| ModSecurity Rules | Granular WAF control | Server level | Advanced | Very High |
| robots.txt | Polite bot instructions | Domain level | Easy | Low (for bad bots) |
Conclusion
Blocking unwanted bots is crucial to maintaining your website's performance, security, and integrity. With Go4hosting's WHM/cPanel managed Apache servers, you have multiple options for blocking bad bots, ranging from simple IP blocking in cPanel to advanced firewall and ModSecurity rules at the server level.
By combining these methods and regularly monitoring your server traffic, you can significantly reduce the impact of malicious bots and protect your hosting environment.
If you need help implementing bot blocking or want to learn about managed security services, contact Go4hosting support for expert assistance.