DAILY NEWS

Stay Ahead, Stay Informed – Every Day

How I built a 6-node 12-GPU on-prem AI cluster running 1000+ agents


TL;DR — 6 machines, 12 GPUs, 1,000+ concurrent agents, P95 18 ms, voice

Why I built this

I’m Franck. Toulouse, France. Over 3 years I paid roughly €280,000 to Azure + OpenAI before doing the math properly:

Latency: 1.2s voice round-trip — incompatible with the voice-first UX I wanted.

Compliance: customer data on US servers. Not GDPR-native, just GDPR-compliant-on-paper.

Quotas: random throttling at the worst times.

Lock-in: Azure outage = my product offline.

I decided to rebuild everything on-prem. This is the result.

The cluster

6 machines, 3 tiers, 12 GPUs total.

Tier 1 — GPU compute (heavy inference)

M1 “La Créatrice” — Ryzen 5700X3D, 6× RTX 3080+, 46 GB RAM. Primary LLM node, runs qwen3.5-9b, qwen3.5-35b-a3b, deepseek-r1, the Claude 4.5/4.6 distillations, and the Whisper CUDA pipeline.

M2 “Le Forge” — multi-GPU NVIDIA, secondary inference, failover from M1 in 1.3s.

Tier 2 — CPU/RAM (orchestration, memory)

M3 “Le Cerveau” — high-RAM CPU node. PostgreSQL + Redis + Pinecone. Runs the orchestrator, the 3-quorum consensus engine (M1+M2+M3), and the analytics/monitoring agents.

Tier 3 — production / work

M4 “Bridge Windows” — Windows 11, 2 GPUs, trading bot live.

M5 “Interface Relay” — Linux i5-6500, 15 GB RAM. Dev interface, 15+ MCP servers, Claude Code.

M6 “Mobile Ops” — laptop. SSH + VPN. Client demos and on-site ops.

The 9 layers I added on top of Ubuntu

L9 — Vocal / conversational (Whisper CUDA STT, Piper TTS, wake word, 50+ languages)
L8 — Multi-agent orchestration (MCP-native, consensus engine)
L7 — Trading consensus engine (multi-model voting GPT/Gemini/Claude)
L6 — Browser + web automation (Chrome DevTools Protocol)
L5 — MCP tool registry (88+ handlers)
L4 — GPU cluster management (Docker Swarm, failover)
L3 — Domino pipeline engine (835 chains)
L2 — systemd service layer (98 units)
L1 — Linux boot integration (GRUB hooks, ZRAM, kernel params)
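
The source does not describe the agreement function behind the consensus layers, so the following is only a minimal sketch. It assumes each of three voters returns a discrete label and that a result is accepted when at least two of them agree. The function name `quorum_vote` and the example labels are hypothetical.

```python
from collections import Counter
from typing import Optional


def quorum_vote(answers: dict[str, str], quorum: int = 2) -> Optional[str]:
    """Return the label chosen by at least `quorum` voters, else None.

    `answers` maps a voter name (for example a node or a model) to its label.
    This is plain majority voting, an assumption rather than the author's method.
    """
    if not answers:
        return None
    label, count = Counter(answers.values()).most_common(1)[0]
    return label if count >= quorum else None


# Hypothetical votes from the three quorum members named in the text.
votes = {"M1": "approve", "M2": "approve", "M3": "reject"}
decision = quorum_vote(votes)
print(decision)  # approve
```

Returning `None` when no label reaches the quorum leaves the caller to decide what "no agreement" means (retry, escalate, or abstain) instead of silently picking a side.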

Real numbers

| Metric | Value |
|---|---|
| Concurrent agents | 1,000+ |
| P95 latency (cluster internal) | 18 ms |
| Voice pipeline end-to-end | |
| Aggregate throughput | 67 tok/s |
| Python lines | 280,741 |
| Public repos | 44 (all MIT) |

Cost comparison (1M tokens/day, team of 10)

| Provider | €/month | P95 | Concurrent agents | Data residency |
|---|---|---|---|---|
| Azure OpenAI | 1,500 | 800 ms-3 s | ~20 | US |
| AWS Bedrock | 1,800 | 700 ms-2.5 s | ~15 | US |
| Mistral Cloud | 800 | 400-800 ms | ~30 | EU |
| JARVIS OS | 0 | 18 ms | 1,000+ | Air-gapped |

For a 50K€ turn-key deployment, break-even vs Azure is 7 months, and the marginal cost is zero after that.
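
The text does not say which monthly cloud bill the 7-month figure is measured against. As a rough check, the sketch below computes break-even under two assumed baselines: the author's historical spend (about €280,000 over 3 years) and the €1,500/month Azure OpenAI figure from the comparison table. It treats on-prem running costs as zero, as the text does; the function name is hypothetical.

```python
import math

DEPLOYMENT_COST_EUR = 50_000  # turn-key price quoted in the text


def break_even_months(upfront_eur: float, monthly_cloud_eur: float) -> int:
    """Whole months until avoided cloud spend covers the upfront cost."""
    return math.ceil(upfront_eur / monthly_cloud_eur)


# Baseline A (assumption): historical spend of ~EUR 280,000 spread over 36 months.
historical_monthly = 280_000 / 36

# Baseline B (assumption): the EUR 1,500/month Azure OpenAI figure from the table.
table_monthly = 1_500

print(break_even_months(DEPLOYMENT_COST_EUR, historical_monthly))  # 7
print(break_even_months(DEPLOYMENT_COST_EUR, table_monthly))       # 34
```

Under these assumptions the 7-month figure matches the historical spend, while the table's 1M-tokens/day scenario would take closer to three years to pay back.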

What I sell now

JARVIS OS turn-key — 20K€ to 250K€ depending on scope.

62 PDF trainings — from €39, 293h of content based on production code (+48 private).

AI infra audit — €1,500, report in 48h.

1-to-1 mentorship — €250/h.

Fractional CTO — day rate (TJM) €1,000-1,150 / permanent contract (CDI) €85-95K. Toulouse / remote.

Honest weaknesses

Consensus voting is empirical. No formal verification of the agreement function.

Tier-2 failure (M3 down) is the weakest scenario — orchestrator dies, cluster keeps inferring but loses persistent memory.

MCP protocol bet — if Anthropic deprecates parts of MCP, I have 88 handlers to refactor.

kWh-per-token efficiency — cloud probably wins on aggregate watts/token, on-prem wins on marginal cost.


How to Price Options at the Institutional Level Using AI (PINNs) and Python



If you work in or study the derivatives market, you know that speed and accuracy in calculating option prices are not just technical goals; they are competitive differentiators. Traditionally, we rely on the Black-Scholes model or Monte Carlo simulations, often implemented in legacy code, to approximate the fair price of a contract. However, when we need to scale these calculations to thousands of simultaneous requests or handle complex boundary conditions, processing bottlenecks appear.

This is where the fusion of Artificial Intelligence and financial physics comes in: PINNs (Physics-Informed Neural Networks). In this article, I will show you how to consume an institutional-grade infrastructure based on physics-informed neural networks to price options in milliseconds using Python.

What are PINNs and why do they matter in finance?

Unlike traditional neural networks, which need billions of historical data points to “learn” a trend (and which often fail when trying to extrapolate), PINNs integrate mathematical laws directly into their loss function. With the governing equations built in, the network avoids mathematical hallucinations and achieves very fast inference, ideal for high-frequency trading (HFT) systems and real-time risk management.

Hands-on: Consuming PINN Master in Python

To avoid having to assemble, train and host a GPU cluster to run this network from scratch, we will use PINN Master – Institutional Option Pricing, a robust API hosted on Azure that exposes this model ready for production. Best of all? It has a 100% free tier for testing.

Step 1: Get your credentials

Before running the script, you just need to access the official PINN Master page on RapidAPI and subscribe to the free plan to obtain your access token. If you have any questions about getting started, there is an easy-to-follow Official PINN Master Startup Tutorial.
Step 2: The Code

With your key in hand, use the code below to price a call option:

```python
import requests

# High-performance API endpoint
url = "https://pinn-master-institutional-option-pricing.p.rapidapi.com/v1/price"

# Contract pricing parameters
querystring = {
    "spot": "100.0",       # Current price of the underlying asset
    "strike": "100.0",     # Option strike price
    "volatility": "0.20",  # Implied volatility (20%)
    "rate": "0.05",        # Risk-free interest rate (5%)
    "maturity": "1.0",     # Time to expiration (1 year)
    "type": "call",        # Contract type: call or put
}

headers = {
    "X-RapidAPI-Key": "YOUR_FREE_KEY_HERE",
    "X-RapidAPI-Host": "pinn-master-institutional-option-pricing.p.rapidapi.com",
}

try:
    # The timeout keeps the script from hanging if the API is unreachable
    response = requests.get(url, headers=headers, params=querystring, timeout=10)
    response.raise_for_status()
    dados_precificacao = response.json()
    print("--- PINN Master Invocation Result ---")
    print(dados_precificacao)
except requests.exceptions.RequestException as e:
    print(f"Error connecting to quant infrastructure: {e}")
```

Why is this approach a game changer?

Predictable Latency: By moving the mathematical calculation to an optimized neural network inference in the cloud, you get consistent response times.

Infrastructure Abstraction: The entire scalability architecture on Azure is hidden behind a simple GET request.

Easy Integration: You can plug the response directly into trading dashboards, dynamic spreadsheets or order execution bots.
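
As a local cross-check on whatever the API returns for the Step 2 parameters, the closed-form Black-Scholes price can be computed with the standard library alone. This is a sketch that assumes a European option with no dividends. The article does not show the response schema or confirm which model the API enforces, so no comparison against specific response fields is attempted. The function names are hypothetical.

```python
from math import erf, exp, log, sqrt


def norm_cdf(x: float) -> float:
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))


def black_scholes_price(spot: float, strike: float, volatility: float,
                        rate: float, maturity: float,
                        option_type: str = "call") -> float:
    """Closed-form Black-Scholes price of a European option (no dividends)."""
    d1 = (log(spot / strike) + (rate + 0.5 * volatility ** 2) * maturity) / (
        volatility * sqrt(maturity)
    )
    d2 = d1 - volatility * sqrt(maturity)
    discounted_strike = strike * exp(-rate * maturity)
    if option_type == "call":
        return spot * norm_cdf(d1) - discounted_strike * norm_cdf(d2)
    if option_type == "put":
        return discounted_strike * norm_cdf(-d2) - spot * norm_cdf(-d1)
    raise ValueError("option_type must be 'call' or 'put'")


# Same parameters as the request in Step 2.
reference = black_scholes_price(100.0, 100.0, 0.20, 0.05, 1.0, "call")
print(f"{reference:.4f}")  # 10.4506
```

A returned price far from this reference for plain European inputs would suggest a unit or parameter mismatch (for example, volatility sent as 20 instead of 0.20).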




How I Discovered and Deobfuscated a Hidden PHP Backdoor on My Server


As developers and system architects, we often secure our code but neglect the silent threats lurking in old directories or clever obfuscations. Recently, I caught a stealthy PHP backdoor ((random_name).php) embedded in a system.

Instead of just deleting it, I decided to reverse-engineer it fully to understand exactly how it works, how it bypasses scanners, and how it maintains persistence on a server.

Here is a quick summary of what I found during the analysis.

🔍 The Anatomy of the Malware

At first glance, the file was heavily obfuscated using multiple layers of encoding to look like harmless gibberish. However, the core mechanism relied on a classic but dangerous pattern:

```php
// The malicious pattern used to execute hidden code
eval(base64_decode($_POST['encoded_payload']));
```

Key Techniques Used by the Attacker:

Layered Obfuscation: The code used deep base64 nesting combined with string manipulation functions to evade signature-based security scanners.

Hidden Tar Extraction: Deep inside the encoded strings, the malware contained a compressed TAR structure. Once triggered, it extracts a full-featured web shell into the server directories.

SSH Persistence: The ultimate goal wasn’t just to execute commands once—the script was designed to inject malicious public keys into the server’s ~/.ssh/authorized_keys file, granting the attacker permanent, direct SSH access without leaving a footprint in the web logs.
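
A first-pass triage for the core pattern above can be sketched as a recursive scan of `.php` files. This is only an illustration: the regular expression and the function name are assumptions, and a signature like this catches the unlayered form only. As the analysis notes, layered obfuscation is designed to slip past exactly this kind of check, so an empty result proves nothing.

```python
import re
from pathlib import Path

# Matches eval( wrapped directly around a common decoding function.
# The list of decoders is an illustrative assumption, not an exhaustive signature.
SUSPICIOUS = re.compile(
    rb"eval\s*\(\s*(?:base64_decode|gzinflate|gzuncompress|str_rot13)\s*\(",
    re.IGNORECASE,
)


def find_suspicious_php(root: str) -> list[Path]:
    """Return .php files under `root` that contain the eval(decoder(...)) pattern."""
    hits = []
    for path in Path(root).rglob("*.php"):
        try:
            data = path.read_bytes()
        except OSError:
            continue  # unreadable file or a directory named *.php: skip it
        if SUSPICIOUS.search(data):
            hits.append(path)
    return sorted(hits)
```

Calling `find_suspicious_php("/var/www")` (the path is an example) returns candidates for manual review; each hit still needs to be read by a person before anything is deleted.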

🛠️ How to Protect Your System

If you suspect your server has been compromised, simply deleting the .php file might not be enough. You need to:

Check your ~/.ssh/authorized_keys for unauthorized entries.

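Audit your system cron jobs to ensure the malware doesn’t have a re-infection script scheduled.

Implement strict file permissions (chmod 644 for files, 755 for directories) and disable dangerous PHP functions like exec() and passthru() with the disable_functions directive in your php.ini. Note that eval() is a language construct rather than a function, so disable_functions cannot block it.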
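
For the authorized_keys check in the list above, a small parser makes each entry easier to review by listing its key type, comment, and SHA256 fingerprint. This is a minimal sketch: the function name is hypothetical, and the whitespace split is a simplification that can misread entries whose options contain quoted spaces.

```python
import base64
import hashlib

KEY_TYPE_PREFIXES = ("ssh-", "ecdsa-", "sk-")


def parse_authorized_keys(text: str) -> list[dict]:
    """Summarize each non-comment line of an authorized_keys file."""
    entries = []
    for line_no, raw in enumerate(text.splitlines(), start=1):
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        tokens = line.split()
        # The key type is the first token that looks like one; options may precede it.
        idx = next(
            (i for i, tok in enumerate(tokens) if tok.startswith(KEY_TYPE_PREFIXES)),
            None,
        )
        fingerprint = None
        if idx is not None and idx + 1 < len(tokens):
            try:
                blob = base64.b64decode(tokens[idx + 1], validate=True)
                digest = hashlib.sha256(blob).digest()
                # Same form that OpenSSH prints: unpadded base64 of the SHA256 digest.
                fingerprint = "SHA256:" + base64.b64encode(digest).decode().rstrip("=")
            except ValueError:
                pass  # not valid base64: report the entry as unparsed
        entries.append({
            "line": line_no,
            "key_type": tokens[idx] if idx is not None else None,
            "fingerprint": fingerprint,
            "comment": " ".join(tokens[idx + 2:]) if fingerprint else line,
        })
    return entries
```

Feed it the contents of each account's `~/.ssh/authorized_keys` and compare the fingerprints with the keys you actually issued; `ssh-keygen -lf` on a known public key prints the same fingerprint form. Entries with a `None` fingerprint deserve a manual look.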

📖 Read the Full Deep Dive

I have documented the complete step-by-step deobfuscation process, the code breakdown, directory structures, and full remediation steps on GitHub.

👉 See full analysis and source code breakdown here:

https://github.com/KhaiTrang1995/Malware-Analysis-Reports-PHP-Backdoor

Tags: #php #security #devsecops #malware


