infrastructure – DAILY NEWS

TECH & AI

How I built a 6-node 12-GPU on-prem AI cluster running 1000+ agents

jackminion, May 20, 2026

TL;DR — 6 machines, 12 GPUs, 1,000+ concurrent agents, P95 18 ms, voice

Why I built this

I’m Franck. Toulouse, France. Over 3 years I paid roughly €280,000 to Azure + OpenAI before doing the math properly:

Latency: 1.2s voice round-trip — incompatible with the voice-first UX I wanted.

Compliance: customer data on US servers. Not GDPR-native, just GDPR-compliant-on-paper.

Quotas: random throttling at the worst times.

Lock-in: Azure outage = my product offline.

I decided to rebuild everything on-prem. This is the result.

The cluster

6 machines, 3 tiers, 12 GPUs total.

Tier 1 — GPU compute (heavy inference)

M1 “La Créatrice” — Ryzen 5700X3D, 6× RTX 3080+, 46 GB RAM. Primary LLM node, runs qwen3.5-9b, qwen3.5-35b-a3b, deepseek-r1, the Claude 4.5/4.6 distillations, and the Whisper CUDA pipeline.

M2 “Le Forge” — multi-GPU NVIDIA, secondary inference, failover from M1 in 1.3s.

Tier 2 — CPU/RAM (orchestration, memory)

M3 “Le Cerveau” — high-RAM CPU node. PostgreSQL + Redis + Pinecone. Runs the orchestrator, the 3-quorum consensus engine (M1+M2+M3), and the analytics/monitoring agents.

Tier 3 — production / work

M4 “Bridge Windows” — Windows 11, 2 GPUs, trading bot live.

M5 “Interface Relay” — Linux i5-6500, 15 GB RAM. Dev interface, 15+ MCP servers, Claude Code.

M6 “Mobile Ops” — laptop. SSH + VPN. Client demos and on-site ops.

The 9 layers I added on top of Ubuntu

L9 — Vocal / conversational (Whisper CUDA STT, Piper TTS, wake word, 50+ languages)
L8 — Multi-agent orchestration (MCP-native, consensus engine)
L7 — Trading consensus engine (multi-model voting GPT/Gemini/Claude)
L6 — Browser + web automation (Chrome DevTools Protocol)
L5 — MCP tool registry (88+ handlers)
L4 — GPU cluster management (Docker Swarm, failover)
L3 — Domino pipeline engine (835 chains)
L2 — systemd service layer (98 units)
L1 — Linux boot integration (GRUB hooks, ZRAM, kernel params)

Real numbers

| Metric | Value |
|---|---|
| Concurrent agents | 1,000+ |
| P95 latency (cluster internal) | 18 ms |
| Voice pipeline end-to-end | |
| Aggregate throughput | 67 tok/s |
| Python lines | 280,741 |
| Public repos | 44 (all MIT) |

Cost comparison (1M tokens/day, team of 10)

| Provider | €/month | P95 | Concurrent agents | Data residency |
|---|---|---|---|---|
| Azure OpenAI | 1,500 | 800 ms-3 s | ~20 | US |
| AWS Bedrock | 1,800 | 700 ms-2.5 s | ~15 | US |
| Mistral Cloud | 800 | 400-800 ms | ~30 | EU |
| JARVIS OS | 0 | 18 ms | 1,000+ | Air-gapped |

For a 50K€ turn-key deployment, break-even vs Azure is 7 months, and the marginal cost is zero after that.
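
A quick sanity check of that break-even figure, as a rough sketch. The 7 months do not follow from the €1,500/month Azure row in the table; they are consistent with the roughly €280,000 over 3 years quoted at the top of the article. Which baseline the author had in mind is an assumption here, and the function name is purely illustrative.

```python
import math

def break_even_months(upfront_eur: float, monthly_cloud_eur: float) -> int:
    """Whole months until cumulative cloud spend reaches the upfront cost."""
    return math.ceil(upfront_eur / monthly_cloud_eur)

# Assumed baseline 1: the author's historical spend, ~EUR 280,000 over 36 months.
historical_monthly = 280_000 / 36

# Assumed baseline 2: the Azure OpenAI row of the comparison table.
table_monthly = 1_500

print(break_even_months(50_000, historical_monthly))  # 7
print(break_even_months(50_000, table_monthly))       # 34
```

With the historical spend as the baseline the deployment pays for itself in 7 months; against the table's 1M-tokens/day scenario it would take closer to 34.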

What I sell now

JARVIS OS turn-key — 20K€ to 250K€ depending on scope.

62 PDF trainings — from €39, 293h of content based on production code (+48 private).

AI infra audit — €1,500, report in 48h.

1-to-1 mentorship — €250/h.

Fractional CTO — day rate (TJM) €1,000-1,150 / permanent contract (CDI) €85-95K. Toulouse / remote.

Honest weaknesses

Consensus voting is empirical. No formal verification of the agreement function.

Tier-2 failure (M3 down) is the weakest scenario — orchestrator dies, cluster keeps inferring but loses persistent memory.

MCP protocol bet — if Anthropic deprecates parts of MCP, I have 88 handlers to refactor.

kWh-per-token efficiency — cloud probably wins on aggregate watts/token, on-prem wins on marginal cost.
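
On the first weakness: the article never shows the agreement function, so the following is only a generic sketch of a quorum vote over three voters, in the spirit of the 3-quorum engine (M1+M2+M3) mentioned earlier. The function name, the vote labels, and the 2-of-3 threshold are all assumptions, not the author's implementation.

```python
from collections import Counter
from typing import Optional, Sequence

def quorum_decision(votes: Sequence[str], quorum: int) -> Optional[str]:
    """Return the answer backed by at least `quorum` voters, else None."""
    if not votes:
        return None
    answer, count = Counter(votes).most_common(1)[0]
    return answer if count >= quorum else None

# Hypothetical votes from three nodes; 2 of 3 is a strict majority.
print(quorum_decision(["buy", "buy", "hold"], quorum=2))   # buy
print(quorum_decision(["buy", "sell", "hold"], quorum=2))  # None
```

Returning None on disagreement makes "no consensus" an explicit outcome the caller has to handle, which is about as far as an empirical scheme can go without formal verification.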

TECH & AI

Under Trump, Chinese Firms Have Abandoned Billions in US Clean Energy Projects

jackminion, May 16, 2026

Remember U.S. infrastructure? Something maybe about how bridges across America have been cracking and sometimes collapsing—or how our energy grid is an antiquated mess? Perhaps something about how the Biden administration passed a $891 billion spending package largely devoted to modernizing all the crumbling hardware undergirding the U.S. economy, making it safer, fortified against extreme weather, and less of a contributor of greenhouse gases? Well, sorry to say, the party’s over. In a sign of just how hostile the Trump administration has been toward its predecessor’s investment in a more sustainable and green economy, Chinese firms have scuttled an estimated $2.8 billion in planned U.S. energy projects over the past year. According to new research by analysts with the Rhodium Group, more than half of China’s proposed plans for clean-energy tech projects across the United States since 2022 have been either paused, delayed, or outright abandoned. “The policy environment is getting more restrictive,” as one former senior counselor to the Biden era’s Department of Commerce, Margaret Jackson, told Bloomberg.

Jackson, now a senior associate at the nonprofit Center for Strategic and International Studies, suspects that this inhospitable climate for green tech investing is unlikely to change even in the not uncommon scenario where Trump’s whims pivot in response to flattery.

“I’m not sure that below him there’s a lot of appetite to create space for more Chinese investment,” Jackson said.

Not quite a solar-powered sunset

Rhodium’s analysts reported that all three of the world’s leading regions for clean tech manufacturing, China, the U.S., and Europe, have pulled back on their commitments over the course of Trump’s first year back in office—but China’s behavior was unique. State intervention had once catapulted China’s domestic clean energy, battery, and electric vehicle manufacturing sectors five-fold from $37 billion in 2018 to a very sizable $189 billion in 2023, creating major market dominance in some areas (like solar) but also an overcapacity problem.

Nevertheless, even with a lower investment total and a flight from U.S. soil, China’s future plans for solar manufacturing infrastructure remain impressively monumental. Rhodium estimates that the nation has about 485 gigawatts of solar cell production capacity currently under construction domestically—or enough to power about 425 million additional Chinese homes a year—plus another 1.3 terawatts (1,300 gigawatts) of solar capacity announced but not yet put in motion. If all goes as planned, China will literally still be doubling its solar power output, according to Rhodium. “The new policy focus on solar manufacturing and the EV supply chain is likely to emphasize maintaining China’s leading position and closing remaining technological gaps and overseas dependencies,” as the group’s report, published Wednesday, concluded.

China’s U.S. solar sell-off

The economic data reflects some more stark anecdotal news documenting how China-based firms have pulled up their solar stakes in communities across America. This month, for example, Chinese solar manufacturing giant JinkoSolar sold off 75.1% of its ownership stake in its U.S. subsidiary to a private equity firm, which will now run JinkoSolar’s 2-gigawatt (GW) solar panel production facility in Jacksonville, Florida.

China’s Trina Solar similarly pawned off a majority stake in its solar manufacturing facility to an American firm, T1 Energy, shortly after Trump won the White House in 2024. And Beijing-headquartered JA Solar also sold its own 2GW solar assembly plant in Arizona to Corning last July. Much of this skittishness ties directly to legal headaches from the Trump administration’s new Foreign Entity of Concern (FEOC) restrictions, introduced last year in that “Big, Beautiful Bill,” which places limits on the amount of Chinese ownership permitted for U.S. energy projects.

While industry analysts told Reuters that most Chinese manufacturers are clearly keeping low-level financial toeholds in their U.S. factories, the clear consequence is more price hikes and less clean energy across America for the foreseeable future as FEOC restrictions slow plans down. As Aaron Halimi, CEO of the San Francisco-based utility developer Renewable Properties, explained it to Reuters, “This is undoubtedly going to continue to increase the cost of power in the United States.”

TECH & AI

How I Host My Side Projects for Under /Month (2026)

jackminion, May 16, 2026

I run 4 live projects on a single VPS. Here’s exactly what I use and what it costs.

The Problem

You built an amazing side project. Now you need to deploy it.

Options:
→ Heroku: Free tier gone, cheapest $5+/mo per app 😬
→ Vercel: Great for frontend, limited backend ⚠️
→ AWS Free Tier: Complex, easy to overspend 💸
→ Shared hosting: Slow, outdated stacks 🐌

What I actually use for my projects:
→ 1 VPS + free tiers = everything running for ~$5/mo total 🎉

My Setup at a Glance

| Project | Tech Stack | Hosting | Cost |
|---|---|---|---|
| AgentVote (main site) | Node.js + Nginx | VPS (port 3000) | Included |
| CryptoSignal | Node.js + SQLite | Same VPS (port 3001) | Included |
| Hugo Blog | Static HTML | Same VPS (Nginx) | Included |
| Text Formatter | Node.js | Same VPS (port 3099) | Included |

Total: $5/month for the VPS. Everything else is free.

Option 1: VPS (What I Use)

Why a VPS?

✅ Full root access — install anything
✅ Run multiple projects on one server
✅ Fixed monthly cost regardless of traffic
✅ Learn DevOps skills that transfer to any job
✅ Complete control over your stack
❌ You manage security updates yourself
❌ No auto-scaling (but side projects don’t need it)

What to Look For

# Minimum specs for most side projects:
CPU: 1-2 cores
RAM: 1-2 GB (Node.js apps are light)
Storage: 25-50 GB SSD
Bandwidth: 1-2 TB/month (plenty for small projects)
OS: Ubuntu 22.04 or 24.04 LTS
Price: $3-6/month

VPS Providers I’ve Used

DigitalOcean — My Recommendation

Basic droplet: $4/month (512MB RAM, 1 vCPU)
Standard droplet: $6/month (1GB RAM, 1 vCPU)
Pros: Simple dashboard, great docs, massive tutorial library
Cons: No free tier
If you sign up through my referral link, you get $100 in credits over 60 days

Hetzner Cloud (Europe-based, excellent value)

CX22: €3.29/month (~$3.50) — 2 vCPU, 2GB RAM, 40GB SSD
Pros: Best price-to-performance ratio
Cons: Support is Germany-timezone

Vultr

Starting at $2.50/month (512MB RAM)
Many global locations
Good if you need servers close to your users

Linode (Akamai)

Starting at $5/month
Reliable, been around forever
Good documentation

My Nginx Config (Running 4 Apps on One Server)

# /etc/nginx/sites-available/myserver
# Each app on its own port, one domain

server {
    listen 80;
    server_name agentvote.cc;

    # Main app
    location / {
        proxy_pass http://127.0.0.1:3000;
        proxy_http_version 1.1;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection 'upgrade';
        proxy_set_header Host $host;
    }

    # CryptoSignal sub-path
    location /signal/ {
        proxy_pass http://127.0.0.1:3001/;
        proxy_set_header Host $host;
    }

    # Blog (static files)
    location /blog {
        alias /root/data/disk/projects/alexchen-blog/public;
        index index.html;
        try_files $uri $uri/ /blog/index.html =404;
    }

    # Text formatter tool
    location /format {
        return 301 /format/;
    }
    location /format/ {
        proxy_pass http://127.0.0.1:3099/;
    }
}
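
One detail in this config is easy to miss: the trailing slash on proxy_pass. When proxy_pass carries a URI part (the final "/" in http://127.0.0.1:3001/), Nginx replaces the matched location prefix with that URI; without a URI part, the request path is passed through unchanged. The sketch below only models that rule for prefix locations. It ignores query strings and URI normalization, and the function name and sample paths are made up for illustration.

```python
from typing import Optional

def upstream_path(request_path: str, location_prefix: str,
                  proxy_uri: Optional[str]) -> str:
    """Path sent upstream for a prefix location (simplified model).

    proxy_uri is the URI part of proxy_pass, e.g. "/" for
    "http://127.0.0.1:3001/", or None when proxy_pass has no URI part.
    """
    if proxy_uri is None:
        return request_path  # passed through unchanged
    # The matched prefix is swapped for the proxy_pass URI.
    return proxy_uri + request_path[len(location_prefix):]

print(upstream_path("/dashboard", "/", None))                 # /dashboard
print(upstream_path("/signal/api/latest", "/signal/", "/"))   # /api/latest
print(upstream_path("/format/pretty", "/format/", "/"))       # /pretty
```

This is why the apps on ports 3001 and 3099 can be written as if they were served from the root: they never see the /signal/ or /format/ prefix.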

SSL with Let’s Encrypt (Free)

# Install certbot
apt install certbot python3-certbot-nginx -y

# Get certificate (free, auto-renews!)
certbot --nginx -d agentvote.cc -d blog.agentvote.cc

# Done! HTTPS enabled, auto-renewal before expiry

Process Management (Keep Apps Running)

# Option A: PM2 (simplest)
npm install -g pm2
pm2 start "node server.js" --name "app1"
pm2 start "node server.js" --name "signal"
pm2 startup # Auto-start on boot
pm2 save # Save process list

# Option B: systemd (no extra deps)
# /etc/systemd/system/app1.service
[Unit]
Description=App1
After=network.target

[Service]
Type=simple
User=root
WorkingDirectory=/root/data/disk/projects/app
ExecStart=/root/.nvm/current/bin/node server.js
Restart=always
RestartSec=10

[Install]
WantedBy=multi-user.target

systemctl daemon-reload # Pick up the new unit file
systemctl enable app1 # Enable on boot
systemctl start app1 # Start now
journalctl -u app1 -f # View logs

Option 2: Free/PaaS Tiers (Great for Startups)

Vercel — Best for Frontend

Free: 100GB bandwidth, 100 serverless function invocations/day
Perfect for: React/Next.js/Vue/Svelte static sites & SSR
My blog’s frontend could run here free
Deploy: connect GitHub repo → auto-deploy on push

Railway — Easiest Backend Hosting

Free tier: $5 credit/month (enough for small hobby apps)
One-click deploy from GitHub
Auto-scales (but watch the costs!)
Great for: APIs, bots, background workers

Render — Heroku Alternative

Free tier: Web service (sleeps after 15min inactivity)
Databases: Free PostgreSQL (up to 90 days trial)
Great for: Quick prototypes, demos

Fly.io — Edge Deployment

Free allowance: 3 shared-cpu VMs × 256MB RAM
Deploy Docker containers globally
Great for: Low-latency global apps

Glitch — For Learning/Experiments

Completely free for public projects
Live editing in browser
Great for: Prototypes, learning, hackathon projects

Option 3: Hybrid Approach (Smartest)

Static sites → Vercel free tier (fast CDN, zero config)
API servers → Your VPS ($5/mo, full control)
Databases → SQLite on VPS (free) or Supabase free tier
Background jobs → Vercel Cron or your VPS
Files → Cloudflare R2 (S3-compatible, 10GB free)
Email → Resend (3000 emails/month free)

Result: Nearly free infrastructure that scales when needed.

My Monthly Cost Breakdown

| Item | Cost | Notes |
|---|---|---|
| VPS (Hetzner/DigitalOcean) | $3.50-$5.00 | Runs all my apps |
| Domain name (.cc) | ~$8/year | ~$0.67/month |
| Let’s Encrypt SSL | $0 | Free, auto-renewing |
| Cloudflare DNS/CDN | $0 | Free tier covers my needs |
| Total | ~$5.67/month | For 4+ projects |
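
The total is simply the VPS price plus the domain fee spread over twelve months. A small sketch of that arithmetic (the function name is illustrative) shows that ~$5.67 is the upper end of the range, reached with the $5.00 VPS:

```python
def monthly_total(vps_monthly: float, domain_yearly: float) -> float:
    """Monthly cost in dollars: VPS plus the domain fee spread over 12 months."""
    return round(vps_monthly + domain_yearly / 12, 2)

print(monthly_total(5.00, 8.00))  # 5.67
print(monthly_total(3.50, 8.00))  # 4.17
```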

How to Get Started (Step by Step)

Week 1: Get One App Running

1. Sign up for DigitalOcean (https://www.digitalocean.com/) or Hetzner
2. Create a droplet/server (Ubuntu 22.04, $4-6/mo plan)
3. SSH into your server
4. Install Node.js via fnm: curl -fsSL https://fnm.vercel.app/install | bash, then fnm install --lts
5. Clone your project: git clone your-repo
6. npm install && npm run build
7. Start it: node server.js (or npm start)
8. Install Nginx: apt install nginx
9. Point domain to server IP
10. Set up SSL: certbot --nginx -d yourdomain.com

Week 2: Add Monitoring

# Uptime monitoring (free)
# UptimeRobot or Uptime Kuma (self-hosted)

# Error tracking
# Sentry (free tier for

# Log management
# journalctl -u your-app (built-in with systemd)
# Or Loki/Grafana (self-hosted free)
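
If a hosted monitor feels like overkill, the uptime check above can be approximated with a few lines run from cron. This is a minimal sketch using only the standard library. The ports come from the setup table, while probing the root path and treating any 2xx answer as "up" are assumptions.

```python
import urllib.error
import urllib.request

# Ports are the ones listed in the setup table; the root path is an assumption.
TARGETS = {
    "AgentVote": "http://127.0.0.1:3000/",
    "CryptoSignal": "http://127.0.0.1:3001/",
    "Text Formatter": "http://127.0.0.1:3099/",
}

def is_up(url: str, timeout: float = 3.0) -> bool:
    """True if the URL (after redirects) answers with a 2xx status in time."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return 200 <= resp.status < 300
    except (urllib.error.URLError, OSError):
        # Covers refused connections, timeouts, and 4xx/5xx responses.
        return False

if __name__ == "__main__":
    for name, url in TARGETS.items():
        print(f"{name}: {'up' if is_up(url) else 'DOWN'}")
```

A check that runs on the same VPS cannot notice the VPS itself going down, so it complements an external monitor and does not replace it.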

Week 3: Optimize

# Add rate limiting to Nginx
# Set up automated backups
# Configure log rotation
# Add health check endpoints
# Monitor resource usage
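
For the automated backups item, an app that keeps its data in SQLite (as CryptoSignal does) can be snapshotted with the online backup API in Python's sqlite3 module, which copies a consistent image even while the app is running. A minimal sketch follows. The function name, the file naming scheme, and any paths passed in are hypothetical.

```python
import sqlite3
from datetime import datetime, timezone
from pathlib import Path

def backup_sqlite(db_path: str, backup_dir: str) -> Path:
    """Copy a SQLite database to a timestamped file via the online backup API."""
    Path(backup_dir).mkdir(parents=True, exist_ok=True)
    stamp = datetime.now(timezone.utc).strftime("%Y%m%dT%H%M%SZ")
    target = Path(backup_dir) / f"{Path(db_path).stem}-{stamp}.db"
    src = sqlite3.connect(db_path)
    dst = sqlite3.connect(target)
    try:
        src.backup(dst)  # consistent snapshot, safe while the app is writing
    finally:
        dst.close()
        src.close()
    return target
```

Called from a daily cron job and paired with a copy to off-server storage, this covers the most common failure for a single-VPS setup: losing the disk.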

What About When You Scale?

Don’t optimize prematurely!

My rule of thumb:

What’s your current hosting setup? Are you overpaying?

Follow @armorbreak for more practical DevOps content.
