Migrating 18 WordPress Sites to a New VPS: A Real-World Story of Backups, Bots, Malware, Cloudflare, and One Very Busy AI Agent

There are two types of server migration stories.

The first type sounds clean and professional:

“We migrated all websites to a new VPS, updated DNS, verified uptime, and improved performance.”

Beautiful. Calm. Almost suspicious.

The second type is what actually happens when you move a fleet of real WordPress sites that have lived through years of plugins, theme changes, traffic spikes, abandoned folders, cache layers, suspicious login attempts, old backups, random cron jobs, and the occasional digital cockroach hiding in wp-content.

This is the second type.

Recently, I migrated 18 WordPress sites to a new Netcup VPS. These were not fresh demo sites with one theme, five posts, and a motivational homepage banner. These were real content sites, running real plugins, receiving real traffic, carrying years of accumulated history.

Some were clean. Some were noisy. Some were quietly carrying scars from previous malware incidents. Some had Ezoic-related DNS history. Some had Cloudflare. Some had plugin leftovers. Some had security rules that looked fine until you tested the weird paths bots actually attack.

The migration started as a hosting move.

It became a full server-hardening project.

Then it became a malware-response system.

Then it became a monitoring framework.

And eventually, it became one of the most useful lessons I have had in managing WordPress at scale:

A VPS is not just a place where your websites live. It is an ecosystem. And if you manage multiple WordPress sites, you are not just a blogger or site owner anymore. You are accidentally running infrastructure.

The accidental sysadmin club is real. Membership is free. Tuition is paid in sleep.

The Starting Point: 18 Sites, One New Server, and Too Many Unknowns

The goal seemed simple enough: move 18 WordPress sites from the old hosting environment to a new VPS.

The new server stack was built around CloudPanel, Nginx, PHP-FPM, MariaDB/MySQL, Cloudflare-proxied domains, and regular WordPress installations. Nothing exotic. Nothing unnecessarily over-engineered.

But when you are migrating many sites, the difficulty is not usually the first successful transfer.

The difficulty is consistency.

One site can be moved manually with patience. Eighteen sites require process.

You need to know:

Which domains are migrated
Which databases belong to which site
Which PHP versions are used
Which DNS records are active
Which sites are Cloudflare-proxied
Which sites use third-party integrations
Which plugins are active
Which admin users exist
Which files are suspicious
Which logs are useful
Which changes are safe to automate
Which changes must never be automated blindly

That last point matters.

A server migration is not a “copy files, import database, pray to the cache gods” operation. Not when the sites make money. Not when organic traffic is involved. Not when one bad .htaccess, Nginx rule, plugin activation, or database option can quietly ruin a site.

So instead of treating it as one big migration, we treated it as a repeatable operational project.

Every site needed to pass a basic checklist:

Files migrated.
Database imported.
wp-config.php verified.
Domain pointed correctly.
Homepage returns HTTP 200.
wp-login.php accessible.
wp-admin redirects correctly.
No database connection errors.
No obvious malware remnants.
No broken plugin paths.
No stale migration artifacts left behind.
Logs available for investigation.
Security rules tested after changes.

That sounds boring.

Boring is good.

In server administration, “boring” means you might actually survive the week.
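
To make the status-code part of that checklist concrete, here is a minimal Python sketch. It is not the script we actually ran. The expected codes and the names `EXPECTED_STATUS`, `evaluate_site`, and `site_passes` are assumptions for illustration, and the observed codes would come from whatever HTTP client you prefer.

```python
from dataclasses import dataclass

# Assumed expectations: homepage and login page return 200, and an anonymous
# request to /wp-admin/ is redirected to the login page.
EXPECTED_STATUS = {
    "/": {200},
    "/wp-login.php": {200},
    "/wp-admin/": {301, 302},
}


@dataclass
class CheckResult:
    path: str
    status: int
    ok: bool


def evaluate_site(observed):
    """Compare observed HTTP status codes (path -> code) with the expected ones."""
    results = []
    for path, allowed in EXPECTED_STATUS.items():
        status = observed.get(path, 0)  # 0 means "no response recorded"
        results.append(CheckResult(path, status, status in allowed))
    return results


def site_passes(observed):
    """A site passes only if every checked path returned an expected code."""
    return all(result.ok for result in evaluate_site(observed))
```

Keeping the expectations in one table means all 18 sites are judged by the same rules, which is the whole point of a checklist.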

Enter Hermes: Not a Magic Wand, More Like a Tireless Junior Sysadmin With a Clipboard

A big part of this project was assisted by my Hermes agent.

I do not mean Hermes magically “fixed the server.” That would be the wrong lesson.

The real value of Hermes was different.

Hermes acted like a tireless operations assistant that could:

Read previous context
Follow checklists
Compare before-and-after states
Generate reports
Suggest safe next steps
Keep track of what had been changed
Avoid repeating the same investigation from scratch
Turn messy terminal output into structured decisions
Build reusable scripts
Create rollback plans
Help separate “urgent” from “interesting but later”

That last one is underrated.

When you manage a VPS with many WordPress sites, everything feels urgent. A suspicious file, a 403, a 499 log entry, a plugin folder, an unknown admin user, a DNS mismatch, a Cloudflare timeout, a bot attack. Each one wants attention.

Without structure, you jump from fire to fire.

With Hermes, the workflow became more disciplined:

Observe first.
Backup before change.
Patch one thing.
Verify.
Document.
Move to the next issue.

That rhythm saved the project from becoming chaos.

Hermes was most useful not because it was “AI,” but because it helped enforce process when the human brain wanted to panic-click everything.

The First Big Lesson: Migration Reveals Problems You Already Had

A lot of site owners think migration creates problems.

Sometimes it does.

But often, migration reveals problems that were already there.

Old servers hide issues beautifully. Cache hides broken layouts. Plugins hide damaged settings. Cloudflare hides origin behavior. Old backups hide malware history. WordPress itself is very good at limping along while pretending everything is fine.

During the migration, we had to distinguish between three types of issues:

Problems caused by the migration.
Problems that already existed but became visible after migration.
Problems that were unrelated but discovered during the audit.

This distinction matters because the fix is different.

For example, if a migrated site looks broken on the new server, the lazy conclusion is:

“The migration failed.”

But that is not always true.

The old live site may only look fine because of page cache. The current database may already contain damaged theme options, missing plugin settings, or malware-modified content. When you move the database to a clean server and bypass old cache, you may finally see the actual state.

That happened during the broader troubleshooting process.

Instead of blindly restoring old backups — which could reintroduce malware — the smarter move was to compare dynamic state, cached state, database options, plugin status, and old server behavior.

In short:

Do not assume the old site is healthy just because the homepage looks okay.

The homepage is a charming liar.

The Malware Chapter: WPCode, Insert Headers, and the Case of the Returning Snippet

The most dramatic part of the project was not the migration itself.

It was malware containment.

A few sites had signs of WordPress-level compromise involving WPCode / Insert Headers style snippet injection. These plugins are legitimate tools when used properly. The problem is that if an attacker gets admin access, snippet plugins become very convenient places to store malicious code.

At first, it is tempting to assume malware means server-level compromise.

Cron job? Rootkit? Hidden PHP process? Compromised SSH?

Those are serious possibilities, and they must be checked.

But in this case, the evidence pointed elsewhere.

Access logs showed a pattern:

Successful WordPress login.
Access to plugin/admin areas.
Plugin activation or interaction.
Malicious snippet creation or modification.

That changed the diagnosis.

The reinfection mechanism was not primarily “the server keeps spawning malware.”

It was closer to:

Compromised WordPress admin session or credentials → plugin/snippet access → malicious code saved through the admin interface.

That distinction matters enormously.

If you treat an admin-level WordPress compromise as a server-level infection, you waste time scanning the wrong layer.

If you only delete the malicious snippet but leave admin sessions alive, the attacker can come back through the front door wearing the same shoes.
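
The login-then-admin-action pattern from the access logs can be checked mechanically. The sketch below is a simplified illustration, not our actual tooling. It assumes the common "combined" access log layout, treats a 302 response to a wp-login.php POST as a likely successful login (a heuristic, not proof), and uses a hypothetical list of sensitive admin paths.

```python
import re

# Assumes the common "combined" access log layout; adjust for a custom format.
LINE_RE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>[A-Z]+) (?P<path>\S+) [^"]*" (?P<status>\d{3}) '
)

# Hypothetical list of admin paths worth flagging after a login.
SENSITIVE_PREFIXES = (
    "/wp-admin/plugins.php",
    "/wp-admin/plugin-install.php",
    "/wp-admin/admin.php?page=wpcode",
)


def find_login_then_admin(lines):
    """Return (ip, path) pairs where an IP posted to wp-login.php, received a
    redirect, and later requested a sensitive admin path."""
    logged_in = set()
    hits = []
    for line in lines:
        match = LINE_RE.match(line)
        if not match:
            continue  # skip lines that do not fit the assumed format
        ip, method = match["ip"], match["method"]
        path, status = match["path"], match["status"]
        # Heuristic: WordPress usually answers a successful login POST with a 302.
        if method == "POST" and path.startswith("/wp-login.php") and status == "302":
            logged_in.add(ip)
        elif ip in logged_in and path.startswith(SENSITIVE_PREFIXES):
            hits.append((ip, path))
    return hits
```

A hit is a lead to investigate rather than a verdict. A legitimate administrator produces exactly the same sequence, so timing and IP context still need a human look.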

The containment process became:

Backup database and key files before changes.
Snapshot active plugins.
Snapshot administrator users.
Remove malicious snippet records.
Deactivate and remove WPCode / Insert Headers where needed.
Move suspicious plugin folders outside loadable plugin paths.
Rotate WordPress salts.
Clear administrator session tokens.
Reset administrator passwords.
Enforce DISALLOW_FILE_EDIT.
In sensitive cases, enforce DISALLOW_FILE_MODS.
Verify no active plugin references remain.
Verify no published malicious snippets remain.
Verify frontend health.
Verify wp-login.php and wp-admin behavior.
Preserve evidence for later review.

The salt rotation and session invalidation were especially important.

A password reset alone is not enough if old sessions remain valid. Attackers do not politely log out after you change the lock.
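
Part of that containment sequence can be written down as a reviewable plan before anything is run. The sketch below only builds command strings. It assumes WP-CLI is installed on the server, which is my assumption for this example, and the function name `containment_plan` and its parameters are hypothetical. Password resets and snippet removal are left out to keep it short.

```python
import shlex


def containment_plan(site_path, admin_logins, backup_file):
    """Build an ordered list of WP-CLI commands for review. Nothing is executed."""
    wp = ["wp", f"--path={site_path}"]
    steps = [
        # Back up and snapshot state before changing anything.
        wp + ["db", "export", backup_file],
        wp + ["plugin", "list", "--status=active", "--format=json"],
        wp + ["user", "list", "--role=administrator", "--format=json"],
        # Regenerate the salts in wp-config.php, invalidating existing auth cookies.
        wp + ["config", "shuffle-salts"],
    ]
    # Remove stored session tokens for each administrator.
    for login in admin_logins:
        steps.append(wp + ["user", "session", "destroy", login, "--all"])
    # Disable the theme and plugin file editor in wp-admin.
    steps.append(wp + ["config", "set", "DISALLOW_FILE_EDIT", "true", "--raw"])
    return [shlex.join(step) for step in steps]
```

Printing the plan first and running it line by line keeps the "backup before change, patch one thing, verify" rhythm intact.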

The Blocksy Webshell Incident: Themes Can Bite Too

Another important finding involved a compromised theme path.

In one case, a webshell was found inside a theme directory, and it appeared capable of re-spawning a malicious must-use plugin on demand.

That is nastier than a simple suspicious plugin file.

A must-use plugin, or mu-plugin, loads automatically on every request and cannot be deactivated from wp-admin. It does not appear in the regular plugin list, only under a separate Must-Use tab that is easy to overlook. That makes it attractive for persistence.

The response was not to casually delete one file and celebrate.

The safer response was:

Quarantine the affected theme.
Remove or isolate suspicious files.
Inspect mu-plugins.
Check database options.
Rotate salts.
Kill active sessions.
Harden file permissions.
Disable file editing from wp-admin.
Verify WordPress core checksums.
Reinstall clean plugin/theme copies where needed.
Add server-side rules to prevent PHP execution in risky upload paths.

This reinforced an old WordPress security truth:

Malware rarely cares whether the vulnerable doorway was a plugin, theme, snippet manager, backup tool, or forgotten file. It only cares that the doorway exists.
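
Blocking PHP execution in upload paths is a server rule, but it helps to know whether anything executable is already sitting there. Here is a minimal sketch of that check, assuming a standard WordPress layout with uploads under wp-content/uploads. The function name and the suffix list are illustrative choices, not part of our scanner.

```python
from pathlib import Path

# File suffixes treated as executable PHP for this illustration.
PHP_SUFFIXES = {".php", ".phtml", ".phar"}


def php_files_in_uploads(site_root):
    """Return relative paths of PHP-like files under wp-content/uploads."""
    uploads = Path(site_root) / "wp-content" / "uploads"
    if not uploads.is_dir():
        return []
    return sorted(
        str(path.relative_to(site_root))
        for path in uploads.rglob("*")
        if path.is_file() and path.suffix.lower() in PHP_SUFFIXES
    )
```

Some plugins drop harmless index.php stubs into uploads, so every result deserves a look before it is treated as hostile.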

Why Logs Became the Hero of the Story

At some point, every security investigation becomes a log investigation.

Without logs, you are guessing.

With logs, you can ask better questions:

Did the attacker hit wp-login.php?
Was there a successful POST?
Which IP appeared before the admin action?
Was the request proxied through Cloudflare?
Did the origin log show the real visitor IP or only Cloudflare edge IPs?
Did suspicious activity happen before or after a cleanup?
Are current alerts new or just old findings in stale logs?
Are 403s expected blocks or broken site behavior?
Are 499s actual failures or client-side disconnects?

One of the biggest improvements after the migration was log quality.

Initially, some sites logged only the connecting Cloudflare edge IP instead of the real visitor IP. Since all migrated sites were Cloudflare-proxied, origin logs showing only edge IPs were not enough for investigation.

So we updated per-site Nginx access logging to use a Cloudflare-aware format that records the real visitor IP from Cloudflare headers.

That made future forensic work much more practical.
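
The idea behind Cloudflare-aware logging is simple: trust the visitor IP header only when the connection really came from Cloudflare. The sketch below shows that logic in Python rather than in Nginx configuration. The ranges listed are a small sample of Cloudflare's published IPv4 ranges, and a real setup should load the full, current list. The function names are hypothetical.

```python
import ipaddress

# Sample subset of Cloudflare's published IPv4 ranges, for illustration only.
# Assumption: a real deployment loads the complete, current list.
CLOUDFLARE_RANGES = [
    ipaddress.ip_network(cidr)
    for cidr in ("173.245.48.0/20", "103.21.244.0/22", "104.16.0.0/13", "172.64.0.0/13")
]


def is_cloudflare(addr):
    """True if the address falls inside one of the listed Cloudflare ranges."""
    ip = ipaddress.ip_address(addr)
    return any(ip in network for network in CLOUDFLARE_RANGES)


def visitor_ip(remote_addr, cf_connecting_ip):
    """Use the CF-Connecting-IP value only when the TCP peer is a Cloudflare edge.

    Otherwise the header could have been forged by a direct visitor, so the
    connecting address is the only trustworthy value.
    """
    if cf_connecting_ip and is_cloudflare(remote_addr):
        return cf_connecting_ip
    return remote_addr
```

The same trust rule is why a header-based log format and an origin firewall that only admits Cloudflare work well together.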

We also improved log retention.

Short log retention is convenient until you need to investigate something that happened eight days ago.

The setup was changed so logs retained more history, used compression, and remained manageable. This was one of those boring improvements that only becomes exciting when it saves your neck later.

The lesson:

Logs are not decoration. Logs are memory. And servers with bad memory make terrible witnesses.

Bot Attacks: The Background Noise of WordPress Life

Once the sites were migrated and logging improved, bot activity became much easier to see.

And the bots were not shy.

They hit:

wp-login.php
xmlrpc.php
/wp/xmlrpc.php
/blog/xmlrpc.php
/wordpress/xmlrpc.php
Other XML-RPC path variants
Common hidden paths
Plugin/theme probes

The important thing is that bots do not only attack the exact default path. If you block /xmlrpc.php but forget variants like /wp/xmlrpc.php, some bots will happily test the side door.

So we added Nginx-level XML-RPC blocking using a broader regex rule, not just one narrow path.
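
To show the difference between a narrow and a broad match, here is an illustrative pattern tested in Python. It is not the exact rule from our Nginx configuration. It is a sketch of the idea: match xmlrpc.php at any directory depth, regardless of letter case.

```python
import re

# Illustrative pattern: "xmlrpc.php" as a full path segment at any depth,
# optionally followed by a query string or trailing path info.
XMLRPC_RE = re.compile(r"(?:^|/)xmlrpc\.php(?:$|[/?])", re.IGNORECASE)


def is_xmlrpc(path):
    """True if the request path targets an xmlrpc.php endpoint variant."""
    return XMLRPC_RE.search(path) is not None
```

With this pattern, /xmlrpc.php, /wp/xmlrpc.php, and /blog/xmlrpc.php?rsd all match, while an ordinary post slug that merely contains the word does not.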

After that, the numbers told the story.

XML-RPC requests were overwhelmingly blocked, and bypasses dropped to zero after the patch window. WordPress login attacks were also rate-limited at the origin level.

This was satisfying because it was measurable.

Security should be measurable whenever possible.

Not just:

“We hardened the server.”

But:

“Here are the request counts. Here are the blocked attempts. Here are the bypasses. Here is what changed after the patch.”

That turns server management from superstition into operations.

And yes, seeing thousands of bot requests get blocked is oddly satisfying. Like watching mosquitoes hit an electric racket.
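
Measuring the effect of a block can be as simple as counting status codes. This sketch assumes the access log has already been parsed into (path, status) pairs. It treats 403 and Nginx's 444 as blocked and flags anything else for review as a possible bypass, which is a deliberately strict assumption of mine.

```python
# Status codes counted as "blocked" in this illustration.
BLOCKED_STATUSES = (403, 444)


def xmlrpc_summary(requests):
    """Summarize XML-RPC requests from an iterable of (path, status) pairs."""
    summary = {"total": 0, "blocked": 0, "bypassed": 0}
    for path, status in requests:
        if "xmlrpc.php" not in path.lower():
            continue  # not an XML-RPC request
        summary["total"] += 1
        if status in BLOCKED_STATUSES:
            summary["blocked"] += 1
        else:
            summary["bypassed"] += 1  # worth a manual look
    return summary
```

Running the same count before and after a rule change is what turns "we hardened the server" into numbers.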

The UFW Origin Lockdown: The Firewall Change That Needed Respect

One of the biggest hardening goals was to protect port 443 so only Cloudflare could reach the origin directly.

The idea was simple:

Visitors access sites through Cloudflare.
Cloudflare connects to the origin.
Random internet traffic should not hit the origin directly over HTTPS.

This reduces direct-origin exposure and makes it harder for attackers to bypass Cloudflare protections.

But firewall changes are powerful. Powerful things deserve paranoia.

The first attempt did not go perfectly.

A few sites showed Cloudflare-side issues after the lockdown attempt, especially sites with previous Ezoic/Middleton routing history. So the change was rolled back.

That rollback was not failure. It was process working correctly.

The worst firewall story is not:

“We tried a rule and rolled back.”

The worst firewall story is:

“We changed rules across all production sites, broke some of them, had no clean backup, and then guessed for three hours.”

After rollback, we investigated.

Some failures looked like verification artifacts. Some involved previous routing layers. Some needed cleaner testing conditions. We did not immediately retry in panic mode.

Then the former Ezoic-routed sites were disconnected from old DNS paths and moved to direct Cloudflare-proxied DNS. We verified public health, checked headers, watched for old integration traffic, and waited for the setup to stabilize.

Only then did we retry the origin lockdown more carefully.

The successful retry included:

Backups of UFW rules.
Cloudflare IP allow rules for 443.
Baseline ports preserved, including SSH and CloudPanel.
Slower verification.
UFW logging enabled.
Checks for blocked official Cloudflare IPs.
Homepage tests.
/wp-json/ tests.
wp-login.php tests.
wp-admin redirect tests.
Watch for 502, 520, and 522 errors.

The retry passed.

The final state was much stronger:

HTTPS origin access allowed only from Cloudflare ranges.
No generic “443 Anywhere” rule left.
SSH and control panel access preserved.
All sites healthy through Cloudflare.
No official Cloudflare IPs blocked.
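
The allow-list side of that lockdown can be generated and sanity-checked before a single rule is applied. The sketch below only produces UFW command strings and checks coverage. The helper names are hypothetical, and the CIDR list is assumed to come from Cloudflare's published ranges.

```python
import ipaddress


def ufw_allow_rules(cidrs, port=443):
    """Return one 'ufw allow' command per validated CIDR. Nothing is executed."""
    rules = []
    for cidr in cidrs:
        network = ipaddress.ip_network(cidr)  # raises ValueError on a malformed range
        rules.append(f"ufw allow proto tcp from {network} to any port {port}")
    return rules


def uncovered(sample_ips, cidrs):
    """Return the sample IPs that none of the allowed ranges would admit."""
    networks = [ipaddress.ip_network(cidr) for cidr in cidrs]
    return [
        ip for ip in sample_ips
        if not any(ipaddress.ip_address(ip) in network for network in networks)
    ]
```

Validating the ranges first means a typo fails loudly on your laptop instead of quietly on a production firewall.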

The lesson:

A good rollback is not a sign of weakness. It is what gives you permission to do serious work.

Monitoring: From One-Time Cleanup to Daily Confidence

After the migration and hardening, the next challenge was ongoing monitoring.

Because cleaning a server once is nice.

Knowing it stayed clean is better.

We built a daily malware scanner/reporting flow and a weekly digest.

The daily report checked for things like:

Suspicious malware signatures
WPCode / Insert Headers reinfection traces
Active plugin references
Bad database options
Published malicious snippet posts
Suspicious capabilities
Known recurring patterns
Site health indicators

The weekly digest summarized broader security health, including bot protection numbers.

This was integrated with Telegram reporting, so important issues would not quietly rot in a log file no one reads.

This is where Hermes became especially useful.

Instead of only responding to emergencies, Hermes helped shape a repeatable monitoring system:

Scanner runs.
Reporter parses.
Severity is classified.
Critical findings are surfaced.
Review items are separated from urgent items.
Known false positives are handled carefully.
Reports are saved.
Telegram gets a concise summary.

A good alerting system does not just scream.

It tells you what matters.

One improvement we made was severity classification. A leftover benign folder should not create the same emotional reaction as active reinfection.

The goal was to avoid alert fatigue.

If every message says “danger,” eventually none of them do.
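
Severity classification does not need to be sophisticated to be useful. Here is a minimal sketch of the idea. The finding types and their levels are hypothetical examples, not the categories our scanner actually uses.

```python
from enum import IntEnum


class Severity(IntEnum):
    INFO = 0
    REVIEW = 1
    CRITICAL = 2


# Hypothetical finding types; a real scanner's categories may differ.
RULES = {
    "active_malicious_snippet": Severity.CRITICAL,
    "unknown_admin_user": Severity.CRITICAL,
    "php_in_uploads": Severity.REVIEW,
    "orphaned_plugin_folder": Severity.INFO,
}


def classify(finding_type):
    # Unknown finding types default to REVIEW so they are never silently dropped.
    return RULES.get(finding_type, Severity.REVIEW)


def summarize(findings):
    """Build a short message: critical items spelled out, review items counted."""
    critical = [f for f in findings if classify(f["type"]) is Severity.CRITICAL]
    review = sum(1 for f in findings if classify(f["type"]) is Severity.REVIEW)
    lines = [f"CRITICAL {f['site']}: {f['type']}" for f in critical]
    lines.append(f"{review} item(s) for review")
    return "\n".join(lines)
```

The default matters most: anything the rules have never seen lands in review, not in the void.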

The “False Positive” Problem: Not Every Alert Is an Emergency

One recurring issue involved review alerts that were not true active compromises.

For example, one site had a leftover Sucuri-related cache folder in uploads. It looked suspicious enough to deserve review, but after inspection, it was benign orphaned plugin cache.

Another site was temporarily not resolving in DNS, which triggered an alert that looked like homepage corruption. That was not malware. It was a DNS state issue.

So the monitoring system had to become smarter.

Not softer. Smarter.

There is a difference.

We added DNS-aware suppression for a specific case, designed so it would auto-lift when DNS resolved again. That way the alert system did not keep shouting about a condition that was already understood, while still preserving future detection.

This is a subtle but important operational principle:

Do not silence alerts because they are annoying. Refine alerts because they are imprecise.

That difference separates monitoring from wishful thinking.
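
A suppression that lifts itself is easiest to build when it stores no state at all. The sketch below illustrates the principle and is not our exact implementation: the homepage alert is held back only while the hostname fails to resolve, and DNS is re-checked on every run. The resolver is passed in as a parameter so the logic can be tested without touching the network.

```python
import socket


def resolves(hostname, resolver=socket.gethostbyname):
    """True if the hostname currently resolves."""
    try:
        resolver(hostname)
        return True
    except OSError:  # socket.gaierror is a subclass of OSError
        return False


def should_alert(hostname, homepage_ok, resolver=socket.gethostbyname):
    """Decide whether a failed homepage check deserves a malware-style alert."""
    if homepage_ok:
        return False
    if not resolves(hostname, resolver):
        # Known DNS condition: report it as a DNS issue, not as corruption.
        return False
    return True
```

Because nothing is remembered between runs, the suppression disappears the moment DNS comes back, and a genuinely broken homepage alerts again.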

The Hidden Enemy: Stale Migration Artifacts

One of the less glamorous but important cleanup steps involved stale migration files.

During migrations, it is common to create temporary files:

Database dumps
Tarballs
Backup archives
wp-config copies
New-server stubs
Plugin snapshots
Test scripts
Temporary directories

These are useful during the move.

They are dangerous when forgotten.

Some migration stubs may contain database credentials. Some archives may expose full site copies if left under web-accessible paths. Some temporary files may confuse future audits.

In our case, stale wp-config-style stubs were identified and removed. Migration tarballs were moved out of temporary memory-backed locations into a proper disk archive. Server memory and swap pressure improved after cleanup.

This is not the exciting part of server management.

But it is the kind of cleanup that prevents future disasters.

The migration is not complete when the site loads.

The migration is complete when the mess behind the move has also been cleaned.
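
Hunting for leftovers is easy to script. Here is a minimal sketch that lists files under a web root whose names look like migration artifacts. The name patterns are hypothetical and intentionally conservative, so extend them to match your own habits.

```python
import fnmatch
from pathlib import Path

# Hypothetical name patterns for common migration leftovers.
ARTIFACT_PATTERNS = (
    "*.sql", "*.sql.gz", "*.tar.gz", "*.tgz",
    "wp-config*.bak", "wp-config*.old",
)


def find_artifacts(web_root):
    """List files under a web root whose names match an artifact pattern."""
    root = Path(web_root)
    return sorted(
        str(path.relative_to(root))
        for path in root.rglob("*")
        if path.is_file()
        and any(fnmatch.fnmatchcase(path.name.lower(), pattern)
                for pattern in ARTIFACT_PATTERNS)
    )
```

Anything this finds inside a web-accessible path should be moved to an archive location outside the document root or deleted, after confirming a proper backup exists elsewhere.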

WordPress Fleet Management: One Site Is a Website, Eighteen Sites Are Infrastructure

Managing one WordPress site is mostly website administration.

Managing 18 WordPress sites on a VPS is infrastructure management.

That means you need fleet-level thinking.

You need to ask:

Are all sites using consistent logging?
Are all sites protected against XML-RPC variants?
Are all sites behind Cloudflare?
Are all sites included in malware scans?
Are all sites included in weekly summaries?
Are all sites returning expected status codes?
Are all admin panels protected?
Are old plugins removed from all sites?
Are old folders quarantined outside loadable paths?
Are backups stored safely?
Are exceptions documented?

Consistency matters because attackers love the forgotten site.

You may harden your biggest website, but if a smaller abandoned domain has an old vulnerable plugin, that site can become the weak link.

The smallest site on the server can create the biggest headache.

In our case, after the original 18-site migration, an additional WordPress site later joined the monitoring scope. That was another useful reminder: your fleet changes. Your inventory must change with it.

Static documentation dies quickly.

Living documentation is part of the system.

The Human-in-the-Loop Rule

One important thing I learned from working with Hermes is that AI agents are most useful when they are not treated as magical autonomous creatures.

For server administration, the best model was human-in-the-loop.

Hermes could prepare commands, summarize logs, generate scripts, compare states, and recommend next actions.

But the rule was:

No blind destructive changes.
No mystery automation.
No “just trust me bro” server work.

Before dangerous operations, the workflow required:

A backup.
A clear goal.
A minimal command.
Expected output.
Rollback path.
Verification step.

This is especially important for people using AI agents with servers.

An AI agent can make you faster.

It can also make you faster at breaking things.

So the process matters more than the tool.

The best agent is not the one that runs the most commands. The best agent is the one that helps you avoid unnecessary commands.

That sounds less glamorous, but production servers prefer boring wisdom.
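
The pre-flight requirements for dangerous operations (backup, goal, minimal command, expected output, rollback, verification) can be enforced with a very small data structure. This is a sketch of the idea with hypothetical names, not a description of how Hermes works internally.

```python
from dataclasses import dataclass


@dataclass
class ChangePlan:
    goal: str
    command: str
    backup_path: str = ""
    expected_output: str = ""
    rollback: str = ""
    verification: str = ""

    def missing(self):
        """Names of the pre-flight items that are still empty."""
        required = {
            "goal": self.goal,
            "command": self.command,
            "backup": self.backup_path,
            "expected output": self.expected_output,
            "rollback": self.rollback,
            "verification": self.verification,
        }
        return [name for name, value in required.items() if not value.strip()]

    def approved(self):
        """A plan is ready for human review only when nothing is missing."""
        return not self.missing()
```

An incomplete plan is not a failure. It is a prompt to slow down before the command reaches a production shell.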

What Actually Made the Migration Successful

Looking back, the success of this project did not come from one heroic command.

It came from a few boring-but-powerful practices repeated consistently.

1. Backups Before Touching Anything Serious

Before major changes, we created backups or snapshots of relevant files, configs, databases, plugin states, and user states.

This made it possible to move confidently.

A backup is not just disaster recovery. It is emotional support for sysadmins.

2. One Change at a Time

When possible, we avoided stacking too many changes together.

If you change DNS, Nginx, firewall rules, plugins, cache, and database options at the same time, debugging becomes a murder mystery where every suspect has an alibi.

One change. Verify. Then continue.

3. Evidence-Based Diagnosis

When WPCode reinfections appeared, we did not assume the cause. We checked logs.

When origin lockdown failed, we did not assume Cloudflare was broken. We rolled back, investigated, then retried safely.

When alerts appeared, we separated active threats from stale findings.

That discipline prevented overreaction.

4. Fleet-Level Consistency

Once a fix worked, it was turned into a repeatable pattern.

Log retention. Cloudflare real IP logging. XML-RPC blocking. Malware checks. Weekly digests.

The goal was not to fix one site beautifully and leave the rest behind.

The goal was to raise the baseline for the whole fleet.

5. Documentation as a First-Class Output

Every meaningful change produced notes or reports.

This became incredibly useful later because we did not have to rely on memory.

In infrastructure work, undocumented fixes become future mysteries.

And future you has enough problems.

Practical Lessons for VPS and WordPress Site Owners

If you manage multiple WordPress sites on a VPS, here are the biggest lessons from this migration.

Keep a Real Site Inventory

Know every domain on the server.

For each site, track:

Domain
Server path
Database name
PHP version
Cloudflare status
Backup status
Important plugins
Admin users
Security exceptions
Monitoring inclusion

If you do not maintain inventory, your server will maintain surprises.
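
An inventory does not need a special tool. A small structured record per site is enough to start. The sketch below covers a subset of the fields listed above. The class, field names, and gap checks are illustrative assumptions.

```python
from dataclasses import dataclass, field


@dataclass
class Site:
    domain: str
    server_path: str
    database: str
    php_version: str
    cloudflare_proxied: bool = True
    monitored: bool = False
    admin_users: list = field(default_factory=list)
    exceptions: list = field(default_factory=list)


def inventory_gaps(sites):
    """Return human-readable gaps, such as sites missing from monitoring."""
    gaps = []
    for site in sites:
        if not site.monitored:
            gaps.append(f"{site.domain}: not in monitoring scope")
        if not site.cloudflare_proxied:
            gaps.append(f"{site.domain}: not proxied through Cloudflare")
        if not site.admin_users:
            gaps.append(f"{site.domain}: admin users not recorded")
    return gaps
```

Running a gap check whenever a site is added is a cheap way to keep the forgotten site from becoming the weak link.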

Treat WordPress Admin Compromise Seriously

If malware appears through snippets, plugin editors, or admin actions, do not only delete the visible payload.

Also rotate salts, clear sessions, reset passwords, review admin users, and disable file editing.

Attackers with valid sessions do not need fancy exploits.

They already have the keys.

Block XML-RPC Properly

If your sites do not need XML-RPC, block it at the Nginx level.

Do not only block the obvious path. Bots test variants.

Also monitor the result. A block you do not measure is just a comforting assumption.

Log Real Visitor IPs Behind Cloudflare

If Cloudflare is proxying your sites, make sure your origin logs are useful.

During an incident, seeing only Cloudflare edge IPs can make investigation painful.

Real visitor IP logging helps connect login attempts, admin actions, and suspicious behavior.

Do Not Leave Migration Junk Behind

Temporary files become permanent risks when ignored.

After migration, clean:

Old database dumps
Tarballs
Config stubs
Test files
Temporary plugin copies
Web-accessible backups

A successful migration should leave the server cleaner than it found it.

Use AI Agents for Structure, Not Blind Automation

An agent like Hermes is extremely useful for maintaining process, summarizing evidence, preparing scripts, and enforcing checklists.

But server work still needs human judgment.

The winning formula is not “AI runs everything.”

The winning formula is:

Human judgment + agent discipline + verified commands + documentation.

That is where the magic actually happens.

The Emotional Side of Server Migration

Nobody talks enough about the emotional side of managing servers.

When you run content sites, the server is not just infrastructure. It is tied to income, traffic, rankings, and years of work.

So when something breaks, it does not feel like a technical issue.

It feels personal.

A homepage returning 500 is not just a status code. It is instant blood pressure.

A malware alert is not just a file path. It is a tiny horror movie.

A Cloudflare 522 during firewall testing is not just a timeout. It is your stomach dropping through the floor.

That is why process matters.

Process keeps you from making fear-based decisions.

It turns:

“Everything is broken!”

Into:

“Homepage 200. Login 200. Admin 302. XML-RPC 403. No new WPCode traces. Logs clean. Continue.”

That shift is huge.

Good operations reduce anxiety because they replace vague fear with specific facts.

Final Thoughts: The Server Is Healthier Than Before

The migration began as a move from one hosting environment to another.

But by the end, the server was not merely migrated.

It was better understood.

The 18 WordPress sites had:

Cleaner migration state
Better logging
Longer log retention
Cloudflare-aware real IP visibility
XML-RPC protection
Login attack rate limiting
Malware monitoring
WPCode reinfection detection
Telegram reporting
Weekly security digests
Safer firewall posture
Better documentation
Repeatable cleanup playbooks

That is the real win.

A migration that only moves files gives you a new server.

A migration that audits, cleans, hardens, and documents gives you a better operation.

And that is the lesson I would share with anyone managing multiple WordPress sites on a VPS:

Do not treat migration as a transportation job.

Treat it as a rare opportunity to understand your infrastructure.

Because once you look closely, you will find things.

Some will be messy.

Some will be funny.

Some will make you question your life choices.

But if you handle it with backups, logs, patience, and a good system, you come out with something far more valuable than a migrated server.

You come out with confidence.

And in the VPS world, confidence is not the feeling that nothing will ever go wrong.

Confidence is knowing that when something does go wrong, you have the tools, process, and evidence to deal with it.

Preferably before your coffee gets cold.