DAILY NEWS

Stay Ahead, Stay Informed – Every Day

Stop Writing Endpoints. Start Defining Systems.



For a long time, I thought building APIs meant writing endpoints.

You know the pattern:

Define a route
Validate input
Query the database
Transform the result
Send a response

Do that over and over again.

Different routes. Same structure.

The Illusion of Control

Writing endpoints feels productive.

You’re in control of everything:

The logic
The validation
The data flow

But after a while, something becomes obvious:

You’re not building systems.

You’re repeating patterns.

The Real Problem

Most APIs look like this:

app.get('/users/:id', async (req, res) => {
  const id = req.params.id;

  if (!id) {
    return res.status(400).json({ error: 'Missing id' });
  }

  const user = await db.users.findById(id);

  if (!user) {
    return res.status(404).json({ error: 'Not found' });
  }

  return res.json(user);
});

Now multiply that by:

Dozens of endpoints
Multiple resources
Different validation rules
Slight variations in logic

You end up with:

Repeated code
Inconsistent patterns
Hard-to-maintain systems

You’re Not Writing Logic. You’re Rewriting Structure.

Look closer at most endpoints.

They follow the same shape:

Extract input
Validate input
Execute query
Handle errors
Return response

The structure doesn’t change.

Only the details do.

So why are we rewriting the structure every time?
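One way to see how little actually varies is to write the shared shape once and pass in only the details. The sketch below is illustrative rather than a recommended library: `makeHandler`, `fakeDb`, and `mockRes` are hypothetical names, and the `req`/`res` objects only mimic the Express shape used in the earlier example.

```javascript
// Hypothetical helper: the five-step shape written once.
// Only `extract`, `validate`, and `query` differ per endpoint.
function makeHandler({ extract, validate, query }) {
  return async function handler(req, res) {
    const input = extract(req);                 // 1. extract input
    const problem = validate(input);            // 2. validate input
    if (problem) {
      return res.status(400).json({ error: problem });
    }
    try {
      const result = await query(input);        // 3. execute query
      if (result == null) {
        return res.status(404).json({ error: 'Not found' });
      }
      return res.json(result);                  // 5. return response
    } catch (err) {
      return res.status(500).json({ error: 'Internal error' }); // 4. handle errors
    }
  };
}

// In-memory stand-in for a real database (assumption for the example).
const fakeDb = new Map([[1, { id: 1, name: 'Ada', email: 'ada@example.com' }]]);

const getUserById = makeHandler({
  extract: (req) => ({ id: Number(req.params.id) }),
  validate: ({ id }) => (Number.isInteger(id) ? null : 'Invalid id'),
  query: async ({ id }) => fakeDb.get(id) ?? null,
});

// Minimal mock of the response object, enough to observe the outcome.
function mockRes() {
  const res = { statusCode: 200, body: undefined };
  res.status = (code) => { res.statusCode = code; return res; };
  res.json = (body) => { res.body = body; return res; };
  return res;
}
```

With Express, the result would be registered as `app.get('/users/:id', getUserById)`. Everything endpoint-specific now fits in three small functions.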

The Shift: Define, Don’t Rewrite

Instead of writing endpoints…

Define them.

What if your API looked like this instead?

get:
  user:
    GetUserById:
      input:
        id: number
      where:
        id: $param.id
      response:
        id: number
        name: string
        email: string

No route handler.

No repeated boilerplate.

Just a definition.
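A definition like this only pays off if something can act on it. As a small sketch of that idea, assume the definition has already been parsed into a plain object (the YAML parsing step is left out). The `coerceInput` function, the `coercers` table, and the two supported type names are assumptions made for illustration, not part of any existing tool.

```javascript
// The GetUserById definition as a plain object (what a YAML parser might produce).
const definition = {
  input: { id: 'number' },
  where: { id: '$param.id' },
  response: { id: 'number', name: 'string', email: 'string' },
};

// Hypothetical coercers, one per type name used in the definition.
// Each returns the coerced value, or undefined when the raw value is unusable.
const coercers = {
  number: (raw) => {
    if (raw === '' || raw == null) return undefined;
    const n = Number(raw);
    return Number.isFinite(n) ? n : undefined;
  },
  string: (raw) => (typeof raw === 'string' && raw.length > 0 ? raw : undefined),
};

// Validate and coerce raw request values against an `input` type map.
// Returns { ok: true, value } or { ok: false, errors }.
function coerceInput(inputSpec, raw) {
  const value = {};
  const errors = [];
  for (const [field, type] of Object.entries(inputSpec)) {
    const coerce = coercers[type];
    if (!coerce) {
      errors.push(`${field}: unknown type "${type}"`);
      continue;
    }
    const coerced = coerce(raw[field]);
    if (coerced === undefined) {
      errors.push(`${field}: expected ${type}`);
    } else {
      value[field] = coerced;
    }
  }
  return errors.length === 0 ? { ok: true, value } : { ok: false, errors };
}
```

Calling `coerceInput(definition.input, { id: '42' })` yields a numeric `id`, while a non-numeric or missing `id` produces an error list. No endpoint has to repeat that check by hand.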

What This Changes

When you define systems instead of writing endpoints:

Structure becomes consistent
Validation becomes automatic
Queries become predictable
Behavior becomes visible

You’re no longer guessing how something works.

You can read it directly.

From Endpoints to Systems

Traditional approach:

Every endpoint is custom
Logic is scattered
Behavior is implicit

System-driven approach:

Endpoints follow a pattern
Logic is structured
Behavior is explicit

You move from “code-first” to “contract-first.”

Where the Code Goes

This doesn’t eliminate code.

It moves it.

Instead of writing endpoint logic repeatedly…

You write:

A compiler that reads definitions
A pipeline that executes them
A system that enforces rules

Code becomes the engine.

Not the repetition.

Example Flow

With a system-driven approach, a request might flow like this:

Request → Parse Definition → Validate → Build Query → Execute → Format Response

The difference is:

The flow is constant
The behavior is defined in configuration
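The flow above can be sketched as a handful of fixed stages that all take the definition as data. This is a minimal illustration under stated assumptions: the `tables` object stands in for a database, the `resource` field and every function name are hypothetical, and only equality filters are supported.

```javascript
// Hypothetical in-memory table standing in for a database.
const tables = {
  user: [
    { id: 1, name: 'Ada', email: 'ada@example.com', passwordHash: 'x' },
    { id: 2, name: 'Linus', email: 'linus@example.com', passwordHash: 'y' },
  ],
};

// A parsed definition (same shape as the GetUserById example).
const getUserByIdDef = {
  resource: 'user',
  input: { id: 'number' },
  where: { id: '$param.id' },
  response: { id: 'number', name: 'string', email: 'string' },
};

// Validate: coerce each declared input and reject missing or mistyped values.
function validate(def, params) {
  const input = {};
  for (const [key, type] of Object.entries(def.input)) {
    const value = type === 'number' ? Number(params[key]) : params[key];
    if (params[key] == null || (type === 'number' && Number.isNaN(value))) {
      throw new Error(`Invalid ${key}`);
    }
    input[key] = value;
  }
  return input;
}

// Build Query: resolve "$param.x" references against the validated input.
function buildQuery(def, input) {
  const where = {};
  for (const [column, ref] of Object.entries(def.where)) {
    where[column] = ref.startsWith('$param.') ? input[ref.slice('$param.'.length)] : ref;
  }
  return { table: def.resource, where };
}

// Execute: a linear scan is enough for the sketch.
function execute(query) {
  const match = (row) =>
    Object.entries(query.where).every(([col, val]) => row[col] === val);
  return tables[query.table].find(match) ?? null;
}

// Format Response: keep only the fields the definition declares.
function formatResponse(def, row) {
  if (row === null) return null;
  return Object.fromEntries(Object.keys(def.response).map((k) => [k, row[k]]));
}

// The flow itself never changes; only `def` does.
function run(def, params) {
  const input = validate(def, params);
  const query = buildQuery(def, input);
  return formatResponse(def, execute(query));
}
```

Because the response is built from the declared fields, a column such as `passwordHash` cannot leak unless the definition names it. That is one concrete sense in which behavior becomes visible in configuration.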

Why This Matters

Without this approach:

Every developer writes endpoints differently
Bugs are repeated across routes
Refactoring becomes painful

With this approach:

Patterns are enforced
Behavior is predictable
Systems scale cleanly

“Isn’t This Less Flexible?”

Yes.

And that’s the point.

Unlimited flexibility leads to:

Inconsistency
Complexity
Fragile systems

Constraints lead to the opposite: clarity, consistency, and scalability.

Where This Fits

This kind of system works best when:

You have repeated CRUD patterns
You want consistent APIs
You care about long-term maintainability

It doesn’t replace every use case.

But it replaces most of the boring, repetitive ones.

The Bigger Idea

This isn’t just about APIs.

It’s about how we build software.

Instead of:

Writing everything manually
Repeating patterns
Hoping for consistency

We can:

Define systems
Enforce structure
Let the engine handle execution

Final Thought

Writing endpoints feels like control.

But it’s often just repetition.

Defining systems feels restrictive at first.

But it leads to something better:

Clarity.

Consistency.

Scalability.

That’s why I stopped writing endpoints…

…and started defining systems.




PCIe Device Passthrough: NIC Name Instability and MAC Pinning



My Proxmox node rebooted, and suddenly the host was unreachable via SSH. I had to plug in a physical monitor and keyboard only to find that my primary network interface, which had been enp4s0 for months, had decided to rename itself to enp5s0.

Because my /etc/network/interfaces file was explicitly tied to enp4s0, the bridge didn’t come up, the IP wasn’t assigned, and I was locked out of my own hardware.

What I expected

I expected the Linux kernel to consistently enumerate my PCIe devices. In a static hardware environment where nothing has moved, the PCI bus address should be deterministic. If the NIC is plugged into the same slot and the BIOS hasn’t changed, enp4s0 should stay enp4s0 forever. This is the “happy path” most documentation assumes.

What actually happened

The reality is that PCIe enumeration is not always a constant. I’m using a mix of onboard NICs and a PCIe expansion card. I also have a GPU passed through to a VM.

The surprise here is how predictable network interface naming (implemented by systemd-udevd, not by the kernel itself) interacts with the PCIe topology. When I added a new PCIe device and tweaked some BIOS settings for IOMMU, the bus numbers assigned during enumeration shifted, and the names derived from them shifted too. A slight change in how the PCIe switch reported the devices caused the index to jump.

This isn’t just a “one-time fluke.” If you’re running a multi-node cluster or using GPUs that might move addresses (something I’ve documented before in GPU PCI Address Instability), you’ll find that the kernel is surprisingly flexible with where it puts things.

The root cause is that enp4s0 is a name derived from the PCI location. If the location changes—even by one digit—the name changes. If your network config depends on that name, your system is one reboot away from a blackout.
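To make the dependency concrete, here is a simplified sketch of how a path-based name follows from the PCI address. It is an approximation of the scheme, not the udev implementation: `predictableName` is a hypothetical helper, and onboard (`eno`) and hotplug-slot (`ens`) names are not handled at all.

```javascript
// Simplified sketch of the path-based naming scheme: en + p<bus> + s<slot>,
// with bus and slot printed in decimal. The real udev logic has more cases
// (onboard "eno", hotplug-slot "ens", multi-function detection) than this.
function predictableName(pciAddress) {
  const match = /^([0-9a-f]{4}):([0-9a-f]{2}):([0-9a-f]{2})\.([0-7])$/i.exec(pciAddress);
  if (!match) {
    throw new Error(`Not a PCI address: ${pciAddress}`);
  }
  const [domain, bus, slot, fn] = match.slice(1).map((part) => parseInt(part, 16));
  let name = 'en';
  if (domain > 0) name += `P${domain}`;
  name += `p${bus}s${slot}`;
  // Simplification: the real check also considers multi-function devices.
  if (fn > 0) name += `f${fn}`;
  return name;
}
```

In this sketch, `0000:04:00.0` maps to `enp4s0` and `0000:05:00.0` maps to `enp5s0`. A single bus renumbering is all it takes to invalidate a config that hard-codes the old name.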

The Fix: MAC Pinning

The only way to stop this is to stop relying on the PCI slot location and start relying on the hardware’s unique identifier: the MAC address.

I decided to use systemd .link files. These let me tell udev: “I don’t care where this device is on the PCIe bus; if it has this MAC address, call it eth0.”

1. Identify the MAC address

First, I had to find the actual MAC of the problematic NIC while I had console access.

ip link show

I looked for the interface that was currently named enp5s0 (the “wrong” name) and copied the link/ether value.
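If you would rather not copy the value by eye, the text output can be parsed. The sketch below is a hypothetical helper (`parseIpLink`) run against sample text that uses a placeholder MAC. It assumes the usual two-line layout of `ip link show` and ignores anything that is not a `link/ether` line.

```javascript
// Parse the text printed by `ip link show` into a { name: mac } map.
// Only Ethernet-style "link/ether" lines are considered.
function parseIpLink(output) {
  const macs = {};
  let current = null;
  for (const line of output.split('\n')) {
    // Header lines look like "2: enp5s0: <FLAGS> ..." (or "5: veth0@if4: ...").
    const header = /^\d+:\s+([^:@\s]+)(?:@\S+)?:/.exec(line);
    if (header) {
      current = header[1];
      continue;
    }
    const ether = /^\s+link\/ether\s+([0-9a-f]{2}(?::[0-9a-f]{2}){5})/i.exec(line);
    if (ether && current) {
      macs[current] = ether[1].toLowerCase();
    }
  }
  return macs;
}

// Sample output with a placeholder MAC (not real hardware).
const sample = [
  '1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN mode DEFAULT group default qlen 1000',
  '    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00',
  '2: enp5s0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP mode DEFAULT group default qlen 1000',
  '    link/ether 00:11:22:33:44:55 brd ff:ff:ff:ff:ff:ff',
].join('\n');
```

Recent iproute2 versions can also emit JSON with `ip -j link show`, which avoids text parsing altogether.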

2. Create the .link file

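I created a custom link file in /etc/systemd/network/. I chose the name 10-lan.link so that it sorts ahead of the distribution default (99-default.link): udev applies the first .link file, in lexical order, that matches the device.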

# /etc/systemd/network/10-lan.link
[Match]
MACAddress=00:11:22:33:44:55

[Link]
Name=eth0

(Note: I’ve anonymized the MAC address above. Use your actual hardware MAC here.)
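For more than one host, the file can be generated instead of typed. This is a small sketch with a hypothetical `renderLinkFile` helper. It assumes the standard square-bracket `[Match]` and `[Link]` section headers and does nothing beyond basic format checks.

```javascript
// Render a systemd .link file that matches on MAC address and sets the name.
function renderLinkFile(mac, name) {
  if (!/^[0-9a-f]{2}(:[0-9a-f]{2}){5}$/i.test(mac)) {
    throw new Error(`Invalid MAC address: ${mac}`);
  }
  // Linux interface names are limited to 15 characters (IFNAMSIZ - 1).
  if (!/^[A-Za-z0-9_.-]{1,15}$/.test(name)) {
    throw new Error(`Invalid interface name: ${name}`);
  }
  return [
    '[Match]',
    `MACAddress=${mac.toLowerCase()}`,
    '',
    '[Link]',
    `Name=${name}`,
    '',
  ].join('\n');
}
```

The returned string would be written to a path such as /etc/systemd/network/10-lan.link. Rejecting a malformed MAC up front is the main benefit, since a .link file that matches nothing fails silently.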

3. Update the network configuration

Once the interface is pinned to eth0, I had to update the Proxmox network configuration to match. I edited /etc/network/interfaces to replace the volatile enp4s0 with the stable eth0.

# Example snippet from /etc/network/interfaces
auto eth0
iface eth0 inet manual

auto vmbr0
iface vmbr0 inet static
    address 10.0.0.x/24
    gateway 10.0.0.1
    bridge-ports eth0
    bridge-stp off
    bridge-fd 0

4. Apply and verify

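I rebooted, since I was already at the console, and verified the name with ip a. A .link file is applied by systemd-udevd when the device is detected, so restarting a network service is not enough to pick it up. On Debian-based hosts it is also worth running update-initramfs -u -k all before the reboot, because a copy of the .link file can be included in the initramfs. The NIC was now consistently eth0, regardless of whether the PCIe bus shifted.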
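A check like the following could catch the original lockout before a reboot does. It is a sketch with a hypothetical `findMissingBridgePorts` helper that compares the bridge ports named in the config text against the interfaces actually present (for example, the directory listing of /sys/class/net). It understands only the `iface` and `bridge-ports` keywords.

```javascript
// Return every bridge port named in an /etc/network/interfaces-style text
// that is not in the list of interfaces currently present on the host.
function findMissingBridgePorts(interfacesText, presentInterfaces) {
  const present = new Set(presentInterfaces);
  const missing = [];
  let currentIface = null;
  for (const rawLine of interfacesText.split('\n')) {
    const line = rawLine.trim();
    if (line === '' || line.startsWith('#')) continue;
    const words = line.split(/\s+/);
    if (words[0] === 'iface') {
      currentIface = words[1];
    } else if (words[0] === 'bridge-ports' || words[0] === 'bridge_ports') {
      for (const port of words.slice(1)) {
        if (port !== 'none' && !present.has(port)) {
          missing.push({ bridge: currentIface, port });
        }
      }
    }
  }
  return missing;
}
```

An empty result means every bridge port resolves to a real device. A non-empty result describes the failure from the start of this post: a bridge that will not come up after the next boot.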

Why this matters

If you’re just running a single VM on a desktop, this is a minor annoyance. But if you’re building a production-grade homelab, this is a critical failure point.

You’ll hit this specifically in these scenarios:

Adding/Removing PCIe Hardware: Adding a new NVMe drive or a GPU can shift the enumeration of other devices on the same root complex.

BIOS Updates: A BIOS update often resets PCIe lane bifurcation or IOMMU settings, which can completely reorder how the kernel sees your NICs.

Using PCIe Switches: Some high-end motherboards or riser cables use PCIe switches that can report different topologies depending on the power state of the devices.

The Tradeoff

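The tradeoff here is that you’re moving away from the “modern” predictable naming convention back to the “old” ethX style. Some people find eth0 ugly or outdated, but in a headless server environment, “ugly” is better than “unreachable.” One caveat: the systemd.link documentation warns that a name the kernel itself may hand out, such as eth0, can race with the kernel’s own assignment, so a custom name outside the ethX range is the safer choice if you have more than one NIC.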

I’ve also seen people try to fix this using udev rules in /etc/udev/rules.d/. While that works, .link files are the native systemd way to handle this and are generally cleaner to maintain.

Lessons Learned

The biggest lesson here is that documentation for Proxmox and Debian assumes your hardware topology is a constant. It isn’t.

When you’re doing complex things like PCIe passthrough—which I’ve detailed in my GPU Passthrough Gotcha Guide—you are intentionally messing with the PCI bus. You’re telling the host kernel to ignore certain devices so the VM can claim them. This volatility is a side effect of that power.

If you are passing through NICs or GPUs, do not trust the default interface names. Pin your critical management interfaces to their MAC addresses immediately. It takes five minutes to set up and saves you from a midnight trip to the server rack because a reboot decided your network card now lives at enp6s0.

For those of you managing larger fleets or complex AI agent infrastructure, this kind of hardware-level stability is the foundation. You can’t build a reliable multi-agent AI pipeline if the underlying Kubernetes worker nodes are randomly losing their network identity.

Next time you’re configuring a new node, don’t just copy the enpXsX name from the GUI. Take the extra step to pin it. Your future self will thank you when the next BIOS update doesn’t break your entire cluster.




The 800ms Barrier: Architecting Interruptible Voice Agents (Lessons from Sarvam AI x Swiggy)



The Signal: The 800ms Latency Barrier

In a research lab, a 3-second delay is an “optimization ticket.” In a live call with a hungry customer on the Swiggy app, 3 seconds is a churn event.

The partnership between Sarvam AI and Swiggy represents a shift in the “Boss Level” of agentic AI. Most developers build voice agents using a Cascaded Pipeline: STT -> LLM -> TTS. The result? A cumulative lag that makes the agent feel like a slow walkie-talkie. To build for the next billion users, you have to architect for Native Audio Streaming and sub-second response times.

Phase 1: The Architectural Bet

We are moving from Request-Response to Streaming State Machines.

The Vendor Trap is relying on general-purpose, text-centric models for a multilingual, audio-first market. If you have to translate “Hinglish” to English just to understand an order, you’ve already lost the latency battle.

The Ownership Path is the Indic-Native Stack. Using Sarvam’s natively trained audio models allows us to process speech-to-intent directly. More importantly, we must implement a Bi-Directional WebSocket architecture. This allows the agent to “listen” while it “speaks”—the only way to handle the most difficult part of human conversation: The Barge-in.
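As a rough sketch of what a streaming state machine means here, the conversation can be modeled as a transition table in which user speech is a valid event in every state, including while the agent is talking. The state names, event names, and the `ConversationStateMachine` class are illustrative assumptions, not part of any Sarvam or Swiggy API.

```javascript
// Hypothetical conversation states and events; the names are illustrative.
const transitions = {
  LISTENING: { USER_SPEECH: 'LISTENING', END_OF_UTTERANCE: 'THINKING' },
  THINKING: { REPLY_READY: 'SPEAKING', USER_SPEECH: 'LISTENING' },
  SPEAKING: { PLAYBACK_DONE: 'LISTENING', USER_SPEECH: 'LISTENING' }, // barge-in
};

class ConversationStateMachine {
  constructor() {
    this.state = 'LISTENING';
    this.bargeIns = 0;
  }

  // Apply one event; events that are not valid in the current state are ignored.
  dispatch(event) {
    const next = transitions[this.state][event];
    if (next === undefined) return this.state;
    if (event === 'USER_SPEECH' && this.state !== 'LISTENING') {
      this.bargeIns += 1; // the user interrupted a reply in progress
    }
    this.state = next;
    return this.state;
  }
}
```

A request-response design has no row for `USER_SPEECH` while in `SPEAKING`, so the interruption is simply lost. Here it is a normal transition, and the bi-directional socket is what delivers the event in time.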

Phase 2: Implementation (The Interruptible Voice Handler)

In a high-stakes environment like Swiggy, the agent must be able to stop mid-sentence and roll back its logic if the user changes their mind.

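// High-level logic for an interruptible voice kernel.
// Assumption: wsConnection is a thin wrapper around the socket that exposes
// processAudio(chunk) and stopPlayback(); a raw WebSocket has neither method.
class VoiceAgentKernel {
  constructor(wsConnection) {
    this.ws = wsConnection;
    this.isSpeaking = false;
    this.generation = null;      // AbortController for the in-flight LLM call
    this.transactionLock = null; // Ensuring tool-use safety
  }

  // Detecting the "barge-in" (interruption)
  onUserSpeechDetected() {
    if (this.isSpeaking) {
      console.warn('SIGNAL: Interruption detected. Executing State Rollback.');
      this.killAudioPlayback();
      this.abortCurrentLLMGeneration();
      this.clearPendingTransactions();
    }
  }

  async handleAudioStream(chunk) {
    // Stream raw audio to Sarvam's native Indic pipeline
    const response = await this.ws.processAudio(chunk);

    if (response.intent_confidence > 0.9) {
      // Pre-warm tools before the user even stops talking
      this.prepareOrderTransaction(response.entities);
    }
  }

  killAudioPlayback() {
    this.ws.stopPlayback();
    this.isSpeaking = false;
  }

  abortCurrentLLMGeneration() {
    if (this.generation) {
      this.generation.abort();
      this.generation = null;
    }
  }

  prepareOrderTransaction(entities) {
    // Stage the order only; nothing reaches the backend until it is committed
    let cancelled = false;
    this.transactionLock = {
      entities,
      cancel: () => { cancelled = true; },
      isCancelled: () => cancelled,
    };
  }

  clearPendingTransactions() {
    // Essential: prevents the "Ghost Order" bug
    if (this.transactionLock) {
      this.transactionLock.cancel();
      this.transactionLock = null;
    }
  }
}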

Phase 3: The Senior Security & Testing Audit

I put this Swiggy-scale blueprint through a professional Senior QA & Security Audit. Here is why your “standard” voice agent will fail in the wild.

The “Ghost Order” Race Condition (Logic Fault)

The Fault: The agent says “Ordering your Paneer Tikka…” The user interrupts: “No, wait! Make it a Chicken Roll!”

The Audit: In naive implementations, the “Order Tool” is triggered the moment the LLM starts talking. If the user interrupts, the audio stops, but the backend API has already committed the Paneer Tikka. You now have a frustrated customer and a wasted order.

The Fix: Implement Deferred Commits. The tool-call must remain in a PENDING state until the audio playback reaches a “Commit Threshold” (e.g., 90% completion) or receives a final verbal confirmation.

The “Ambient Audio Injection” (Security Breach)

The Fault: The user is ordering food while walking past a loud TV. The TV says “Cancel all orders.”

The Audit: Without Speaker Diarization, the agent cannot distinguish between the primary user and background noise. A malicious or accidental “audio injection” can trigger unauthorized actions.

The Fix: Use Sarvam’s front-end audio processing to enforce Voice Activity Detection (VAD) with a noise-floor gate. If the audio signal doesn’t match the primary speaker’s decibel profile or spatial characteristics, the kernel must ignore the intent.

The “Colloquial Logic Bypass” (Semantic Security)

The Fault: Your security prompts are in English, but the user is speaking a dialect-heavy mix of Hindi and regional slang.

The Audit: Traditional English-centric guardrails often miss the nuance of regional insults or “Hinglish” social engineering attempts used to trick the agent into granting a 100% discount.

The Fix: Security filters must be Indic-Native. By using Sarvam’s regional guardrails, we ensure that semantic boundaries are enforced at the phoneme level, not just the translation level.
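The Deferred Commit fix for the Ghost Order case can be sketched as a small wrapper that holds the side effect until a threshold or confirmation arrives. This is one possible shape under stated assumptions: `PendingTransaction` is a hypothetical class, `action` stands in for the real order API call, and the 0.9 default mirrors the 90% example above.

```javascript
// Hypothetical deferred-commit wrapper: the side effect runs only on commit().
class PendingTransaction {
  constructor(action, commitThreshold = 0.9) {
    this.action = action;               // function performing the real API call
    this.commitThreshold = commitThreshold;
    this.state = 'PENDING';             // PENDING -> COMMITTED | CANCELLED
  }

  // Called as playback advances; progress is a fraction from 0 to 1.
  onPlaybackProgress(progress) {
    if (this.state === 'PENDING' && progress >= this.commitThreshold) {
      this.commit();
    }
  }

  // Called on explicit verbal confirmation or when the threshold is reached.
  commit() {
    if (this.state !== 'PENDING') return false;
    this.state = 'COMMITTED';
    this.action();
    return true;
  }

  // Called on barge-in; has no effect once the order has been committed.
  cancel() {
    if (this.state !== 'PENDING') return false;
    this.state = 'CANCELLED';
    return true;
  }
}
```

Both `commit()` and `cancel()` only succeed from `PENDING`, so whichever arrives first wins and the other becomes a no-op. That single-transition rule is what removes the race.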

Phase 4: Checklist (The Architect’s Standard)

[ ] Native Audio or Bust: If you are still converting audio to text before processing intent, your latency will never hit the 800ms gold standard.

[ ] Transactional Barge-in: Verify that every interruption triggers a State Rollback for any pending API calls.

[ ] Acoustic Hardening: Test your agent against 60dB of background “street noise” to ensure VAD stability.

[ ] Regional Edge-Cases: Audit your “Hinglish” logic. Does your agent understand the difference between a user “asking for a discount” and a user “threatening to cancel”?
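For the acoustic hardening item, a noise-floor gate can be approximated with a simple energy check. The sketch below is a generic technique, not Sarvam’s front-end: `rmsDbfs` and `NoiseFloorGate` are hypothetical names, the default values are arbitrary, and levels are in dBFS (relative to digital full scale), so mapping them to a 60dB sound-pressure test depends on microphone calibration. It also cannot tell speakers apart; it only rejects audio that is not clearly louder than the background.

```javascript
// Root-mean-square level of a frame of samples in [-1, 1], in dBFS.
function rmsDbfs(samples) {
  let sumSquares = 0;
  for (const s of samples) sumSquares += s * s;
  const rms = Math.sqrt(sumSquares / samples.length);
  return rms === 0 ? -Infinity : 20 * Math.log10(rms);
}

// Hypothetical gate: a frame counts as speech only if it is at least
// `marginDb` louder than the running noise-floor estimate.
class NoiseFloorGate {
  constructor({ marginDb = 12, smoothing = 0.95, initialFloorDb = -60 } = {}) {
    this.marginDb = marginDb;
    this.smoothing = smoothing;
    this.floorDb = initialFloorDb;
  }

  isSpeech(samples) {
    const level = rmsDbfs(samples);
    const speech = level > this.floorDb + this.marginDb;
    if (!speech && Number.isFinite(level)) {
      // Track the floor only from frames classified as noise.
      this.floorDb = this.smoothing * this.floorDb + (1 - this.smoothing) * level;
    }
    return speech;
  }
}
```

A test harness would feed recorded street noise through `isSpeech` frame by frame and count false positives before letting any intent reach the transaction layer.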

The Bottom Line: Building for the next billion users requires an infrastructure that respects the speed of human thought. Sarvam AI provides the native Indic engine; your job is to build the Deterministic House that keeps the order safe.


