Vibe Coding Best Practices
Routing & API Safety · 12 min read

Middleware, proxies and automated attack scanners

Why public websites get probed immediately, what common scanner traffic looks like, and how to use middleware or proxy rules to block obvious attacks before route code runs.

Level: Intermediate

rg "middleware|proxy|matcher|NextResponse|request\.nextUrl"
rg "wp-admin|xmlrpc|\.env|\.git|backup|wp-config"
vercel logs --since 24h

The public internet is noisy.

The moment you put a real domain online, bots will try paths your app has never heard of.

They do not care that your app is Next.js, Astro, Laravel, Rails or a vibe-coded SaaS built with an AI agent. They will still request WordPress installers, leaked environment files, Git metadata, backup folders, debug logs and random executable files.

That is not paranoia. That is normal background radiation on the web.

A real 24-hour example

While writing this lesson, we pulled a 24-hour sample of warning logs from PageLens AI's production deployment.

In that sample, 46 of the 50 warning logs were security blocks from automated probes. The other warnings were unrelated runtime noise.

The blocked requests included patterns like:

  • GET /wp-admin/install.php?step=1
  • POST /xmlrpc.php
  • GET /.env
  • GET /.env.local
  • GET /.env.production
  • GET /.git/config
  • GET /.git/HEAD
  • GET /wp-config.php.bak
  • GET /debug.log
  • GET /composer.json
  • GET /backup
  • GET /jame.bat

PageLens is not a WordPress site.

That did not matter. Bots still looked for WordPress setup pages and WordPress config backups.

PageLens does not publish .env files.

That did not matter either. Bots still tried .env, .env.local, .env.production, .env.staging, .env.backup and similar variants.

This is the lesson: attackers and scanners do not inspect your tech stack politely before trying old, common, profitable paths.

What middleware or proxy is for

Middleware, proxy rules or edge routing checks let you reject obvious bad requests before your app route does any real work.

In a Next.js app, that might be middleware.ts or proxy.ts, depending on your version and routing setup.

On other platforms, the same idea might live in:

  • Vercel middleware or route handling
  • Cloudflare WAF rules
  • Netlify edge functions
  • reverse proxy rules
  • server middleware in Express, Fastify or Hono
  • framework-level route guards

The exact file matters less than the responsibility:

Request -> cheap public-internet filter -> route handler -> business logic

You want the cheap filter to catch requests that should never reach business logic.
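That flow can be sketched with plain functions. Everything here is illustrative rather than a framework API: in a real Next.js app the filter would live in middleware.ts (or proxy.ts) and return a NextResponse, but the ordering is the point.

```typescript
// Illustrative pipeline: a cheap filter runs before the route handler.
// All names are hypothetical; the "route handler" stands in for your app code.

type Result = { status: number; body: string };

// Cheap public-internet filter: no database, no auth, just string checks.
function cheapFilter(pathname: string): Result | null {
  if (pathname.startsWith("/.env") || pathname.startsWith("/.git")) {
    return { status: 404, body: "Not Found" };
  }
  return null; // null means "let the request through"
}

// Expensive business logic that should only ever see real traffic.
function routeHandler(pathname: string): Result {
  return { status: 200, body: `handled ${pathname}` };
}

function handle(pathname: string): Result {
  return cheapFilter(pathname) ?? routeHandler(pathname);
}
```

The filter is deliberately dumb: string prefix checks cost almost nothing, so junk requests never reach the expensive layer.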

What to block early

Start with paths that are never legitimate for your app.

Common examples:

  • WordPress probes: /wp-admin, /wp-login.php, /wp-config
  • XML-RPC probes: /xmlrpc.php
  • environment files: /.env, /.env.local, /.env.production
  • Git metadata: /.git, /.git/config, /.git/HEAD
  • backup paths: /backup, /backups, /db.sql, /dump.sql
  • dependency manifests you do not serve: /composer.json, /package-lock.json
  • logs and archives: .log, .sql, .bak, .old, .zip, .tar.gz
  • executable scripts you never serve: .php, .bat, .cmd, .sh

Do not blindly copy a giant deny-list from the internet. Add rules that match your app and hosting setup.

For example, if your product legitimately serves downloadable .zip files, do not block all .zip paths globally.
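One way to encode such a list is a small helper with explicit prefixes, extensions and app-specific exceptions. This is a sketch: the prefixes and extensions mirror the list above, and the /downloads/ .zip exception is a hypothetical example of the caveat just mentioned.

```typescript
// A minimal deny-list check. Adjust the lists to your own app and hosting
// setup; do not copy them blindly. The /downloads/ exception is hypothetical.

const BLOCKED_PREFIXES = [
  "/wp-admin", "/wp-login.php", "/wp-config", "/xmlrpc.php",
  "/.env", "/.git", "/backup", "/backups",
  "/composer.json", "/package-lock.json",
];

const BLOCKED_EXTENSIONS = [
  ".log", ".sql", ".bak", ".old", ".zip", ".tar.gz",
  ".php", ".bat", ".cmd", ".sh",
];

function isBlockedPath(pathname: string): boolean {
  const path = pathname.toLowerCase();
  // App-specific exception: this app legitimately serves .zip downloads.
  if (path.startsWith("/downloads/") && path.endsWith(".zip")) return false;
  if (BLOCKED_PREFIXES.some((prefix) => path.startsWith(prefix))) return true;
  return BLOCKED_EXTENSIONS.some((ext) => path.endsWith(ext));
}
```

Prefix matching catches variants for free: a rule for /.env also covers /.env.local, /.env.production and /.env.backup.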

Why not just let them 404?

Sometimes a plain 404 is fine.

But middleware-level blocking gives you a few useful benefits:

  • route handlers do not waste compute on obvious junk
  • suspicious requests can be logged with a consistent reason
  • repeated probes from one source can trigger a temporary ban
  • risky paths can be hidden behind the same control layer
  • production logs become easier to analyse by category

The goal is not to make your app "secure" through a deny-list.

The goal is to remove obvious garbage before it touches expensive, sensitive or confusing parts of the app.

Middleware is not a substitute for auth

This is important.

Do not treat middleware as your only security boundary.

Middleware is useful for:

  • rejecting impossible paths
  • redirecting canonical domains
  • applying coarse access checks
  • adding security headers
  • rate limiting obvious abuse
  • logging suspicious probes

But route handlers still need their own protections:

  • authentication
  • authorization
  • ownership checks
  • input validation
  • rate limits for expensive actions
  • safe response shapes
  • server-only secrets

If an API route reads private data, the route should verify the user and the user's right to that data. Do not rely on "middleware should have stopped this".

Middleware is a front gate. Your route code still needs locks on the doors.

What good logging looks like

When a request is blocked, log enough to investigate patterns without leaking sensitive data.

Useful fields:

  • timestamp
  • request method
  • path
  • reason
  • source IP or hashed IP, depending on your privacy posture
  • user agent if useful
  • deployment or environment
  • response status

Avoid logging:

  • full authorization headers
  • cookies
  • tokens
  • request bodies with personal data
  • query strings that may contain secrets

Good blocked-request logs tell a story like:

[security] blocked GET /.env.local — blocked path prefix: /.env
[security] blocked POST /xmlrpc.php — blocked path prefix: /xmlrpc.php
[security] IP banned after repeated probes — last reason: blocked path prefix: /.env

That is enough to see the pattern without publishing secrets into your logs.
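Log lines like those above can come from a small formatter. This is a sketch that assumes you only have the method, path and block reason in hand, and it strips the query string so secrets never reach the logs.

```typescript
// Build a blocked-request log line without touching headers, cookies,
// or request bodies. Field names here are illustrative.

type BlockEvent = {
  method: string;
  path: string;   // may still contain a query string at this point
  reason: string; // e.g. "blocked path prefix: /.env"
};

function formatSecurityLog(event: BlockEvent): string {
  // Drop anything after "?" so secrets in query strings are never logged.
  const safePath = event.path.split("?")[0];
  return `[security] blocked ${event.method} ${safePath} — ${event.reason}`;
}
```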

Add repeat-probe controls

One bad request might be random.

Ten bad requests in a burst is a signal.

A practical middleware layer can track repeated probes and temporarily block the source. The rule does not need to be clever at first.

Example policy:

If an IP hits 10 blocked security paths inside 10 minutes,
return 404 or 403 for future requests for a short window.

This should be conservative. Do not ban real users for normal navigation mistakes.

Use it for paths no real visitor should ever hit: /.env, /.git/config, /wp-config.php.bak, /xmlrpc.php.
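A minimal version of that policy can be sketched as an in-memory counter. The 30-minute ban length is an assumption, and a multi-instance deployment would need shared storage (Redis, or your platform's rate limiter) instead of a process-local Map.

```typescript
// In-memory repeat-probe tracker: ban a source after N blocked hits
// inside a time window. Fine for a single process; not multi-instance safe.

const WINDOW_MS = 10 * 60 * 1000; // 10-minute window, as in the policy above
const MAX_PROBES = 10;            // 10 blocked hits triggers a ban
const BAN_MS = 30 * 60 * 1000;    // hypothetical 30-minute ban

const probes = new Map<string, number[]>(); // ip -> probe timestamps
const bans = new Map<string, number>();     // ip -> ban expiry

// Call this when a request hits a blocked security path.
// Returns true if the IP is now (or was already) banned.
function registerProbe(ip: string, now: number): boolean {
  const expiry = bans.get(ip);
  if (expiry !== undefined && now < expiry) return true;

  // Keep only probes inside the sliding window, then record this one.
  const recent = (probes.get(ip) ?? []).filter((t) => now - t < WINDOW_MS);
  recent.push(now);
  probes.set(ip, recent);

  if (recent.length >= MAX_PROBES) {
    bans.set(ip, now + BAN_MS);
    return true;
  }
  return false;
}
```

Because only requests to blocked security paths feed the counter, ordinary 404s from real visitors never count toward a ban.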

How to inspect your app

Search for your routing layer:

rg "middleware|proxy|matcher|NextResponse|request\\.nextUrl"

Search for existing block rules:

rg "wp-admin|xmlrpc|\\.env|\\.git|backup|wp-config|blocked path"

Then ask your agent:

Review the middleware/proxy layer in this app.

Find:
- paths that should be blocked before route handlers run
- sensitive paths that should never be public
- route matchers that accidentally skip important paths
- logging that may expose secrets
- places where middleware is being treated as the only auth boundary

Use production-log examples where available. Propose the smallest safe changes and explain how to verify them after deployment.

What to verify after deployment

After adding or changing middleware rules, test both blocked and legitimate paths.

Blocked examples:

curl -i https://your-domain.com/.env
curl -i https://your-domain.com/.git/config
curl -i https://your-domain.com/wp-admin/install.php
curl -i -X POST https://your-domain.com/xmlrpc.php

Legitimate examples:

curl -I https://your-domain.com/
curl -I https://your-domain.com/pricing
curl -I https://your-domain.com/api/health

You are checking two things:

  1. obvious scanner paths get blocked cheaply
  2. real pages and API routes still work

Where PageLens fits

PageLens cannot see every private middleware rule in your repo from the outside.

But it can help you audit the public surface:

  • security headers
  • exposed public routes
  • suspicious crawlable paths
  • metadata and trust signals
  • authenticated route setup
  • AI-agent repair output

Pair that external view with production logs and code review.

Use the logs to see what scanners are trying.

Use middleware or proxy rules to block the obvious junk.

Use route-level auth and validation to protect real data.

Use PageLens to check the site your users and bots can actually reach.
