
TECH & AI

Claude Security Update: Scans, Webhooks, 6 Partners

jackminion May 27, 2026 0

Claude Security left its launch behind with scheduled scans, directory targeting, and CSV or Markdown exports.
Slack and Jira webhooks plus dismissals that stick turn a one-off scan into a weekly review loop.
Six security platforms now build on Opus 4.7, from CrowdStrike and Wiz to Microsoft Security.
It stays Enterprise-only in beta, so here is what a solo studio runs in its place today.

When Claude Security reached public beta about a month ago, it was a sharp scanner wrapped around a thin workflow. You pointed it at a repository, it reasoned through the code the way a security researcher would, and it handed back findings with suggested patches. Useful, but hard to live with day to day. The version sitting in the Claude.ai sidebar this week is a different animal: scheduled scans, webhooks into Slack and Jira, directory-level targeting, and six security platforms now wiring the same Opus 4.7 model into their own tools.

From Launch to Workflow: What Actually Shipped

At launch the pitch was simple. Scan a repo, explain the vulnerability in plain language instead of a raw CVE dump, propose a fix, and leave the decision to a human. The public-beta launch a month ago covered that first version. The gap was everything around the scan.

This week that gap is mostly filled. You can schedule scans on a cadence instead of running them by hand, which matters because security debt accrues quietly between releases. You can target a single directory inside a large monorepo rather than waiting on the whole tree, so a focused review of the payments module finishes in minutes instead of an hour. You can export findings as CSV or Markdown and drop them straight into an existing tracker or an audit trail. And you can dismiss a finding with a documented reason, with that dismissal persisting across future runs so you are not re-triaging the same false positive every week.
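For a sense of what a CSV export has to get right before a tracker will import it cleanly, here is a minimal sketch that serialises findings with RFC 4180 quoting. The finding fields (id, severity, file, title) are assumptions for illustration, not the product's actual export schema.

```javascript
// Hypothetical finding shape; the real export schema is not documented here.
const COLUMNS = ["id", "severity", "file", "title"];

// Quote a field when it contains a comma, a double quote, or a line break (RFC 4180).
function csvField(value) {
  const text = String(value ?? "");
  return /[",\r\n]/.test(text) ? `"${text.replace(/"/g, '""')}"` : text;
}

// One header row, then one row per finding, joined with CRLF.
function findingsToCsv(findings) {
  const header = COLUMNS.join(",");
  const rows = findings.map((f) => COLUMNS.map((c) => csvField(f[c])).join(","));
  return [header, ...rows].join("\r\n");
}

const sample = [
  {
    id: "F-1",
    severity: "high",
    file: "payments/charge.js",
    title: 'SQL built from "name", unescaped',
  },
];
console.log(findingsToCsv(sample));
```

The quoting step is the part that matters: finding titles routinely contain commas and quotes, and an unquoted field silently shifts every column after it.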

Underneath all of it is a multi-stage validation pipeline. Each finding is checked before it ever reaches you, and every one carries a confidence rating. That validation step is the part that decides whether a scanner is worth keeping, because a tool that cries wolf gets muted within a week. The model reads imports, follows data flow, and reasons about whether a flagged pattern is actually reachable, which is the kind of judgment a regex-based scanner cannot make. You reach the whole thing from the Claude.ai sidebar or at claude.ai/security, with no API integration and no custom agent to build.

In practice the findings cluster around the same few classes: injection through unsanitised input, broken authorization checks, secrets committed by accident, server-side request forgery, and unsafe deserialization. The directory targeting is what makes that tractable on a real codebase. Instead of scanning a 200,000-line monorepo and drowning in a single report, you can scope a run to the service you just changed, review it, and move on. A scoped scan that finishes while you are still in the context of the change is a scan you will actually read.

The Webhook and Dismissal Loop

Two features do the real work of turning a scanner into a habit: webhooks and persistent dismissals.

Webhooks push results into Slack, Jira, or anything that accepts a hook. A scan becomes a ticket without a single copy-paste, and the finding lands where the team already works instead of in a dashboard nobody opens. Persistent dismissals mean a finding you reviewed and rejected stays gone instead of resurfacing on the next pass, which is the single biggest source of fatigue with older tools.
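If you are wiring a similar hook yourself, the receiving side is small. Slack incoming webhooks accept a JSON POST with a text field, so the sketch below only has to shape a finding into that body. The finding fields are hypothetical, and the webhook URL is whatever your own Slack app issues.

```javascript
// Hypothetical finding shape, for illustration only.
function buildSlackMessage(finding) {
  const confidence = Math.round(finding.confidence * 100);
  return {
    text:
      `[${finding.severity.toUpperCase()}] ${finding.title}\n` +
      `${finding.file}:${finding.line} (confidence ${confidence}%)`,
  };
}

// Slack incoming webhooks take a JSON POST. Uses the global fetch in Node 18+.
async function postToSlack(webhookUrl, finding) {
  const res = await fetch(webhookUrl, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(buildSlackMessage(finding)),
  });
  if (!res.ok) throw new Error(`Slack webhook failed: ${res.status}`);
}
```

Keeping the message builder separate from the network call means the formatting can be tested without posting anything.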

Put them together and you get a loop. Scan on a schedule, surface only the new findings, route them to wherever your team lives, dismiss the noise with a reason, and let the next scan respect that choice. That loop is the entire difference between a tool you run once for a screenshot and one you run every Friday.

It is also where the contrast with the rule-based generation shows. Snyk, Dependabot, and GitGuardian are good at matching known signatures and flagging dependencies with published advisories. They are far less good at explaining why a specific code path in your own logic is exploitable, and they tend to bury the signal under a wall of severity badges. Confidence ratings plus dismissals let you set a noise floor, so only the findings worth a human minute get through. The promise is fewer alerts, each one carrying more context.
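The noise floor is easy to picture as a filter. This is a minimal sketch, assuming each finding carries a confidence between 0 and 1 and that a dismissal is stored against a stable fingerprint of the finding so it survives line-number shifts. The field names and the 0.7 default are hypothetical, not the product's data model.

```javascript
const crypto = require("node:crypto");

// Stable key for a finding. The fields used (rule, file, snippet) are hypothetical.
function fingerprint(finding) {
  return crypto
    .createHash("sha256")
    .update(`${finding.rule}|${finding.file}|${finding.snippet}`)
    .digest("hex");
}

// Keep findings at or above the confidence floor that were not dismissed earlier.
function triage(findings, dismissals, minConfidence = 0.7) {
  return findings.filter(
    (f) => f.confidence >= minConfidence && !dismissals.has(fingerprint(f))
  );
}

// dismissals maps fingerprint -> documented reason, persisted between runs.
const dismissals = new Map();
const falsePositive = { rule: "ssrf", file: "img/proxy.js", snippet: "fetch(url)", confidence: 0.9 };
dismissals.set(fingerprint(falsePositive), "URL is validated against an allowlist upstream");

const thisWeek = [
  falsePositive,
  { rule: "sqli", file: "db/users.js", snippet: "db.query(sql + id)", confidence: 0.95 },
  { rule: "xss", file: "ui/card.js", snippet: "el.innerHTML = name", confidence: 0.4 },
];
console.log(triage(thisWeek, dismissals).map((f) => f.rule)); // [ 'sqli' ]
```

Of three findings, one is a known false positive and one falls under the floor, so a single item reaches a human.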

Six Platforms Now Run on Opus 4.7

The bigger move is who is building on it. CrowdStrike, Microsoft Security, Palo Alto Networks, SentinelOne, TrendAI, and Wiz are embedding Opus 4.7 into their own security products. On the services side, Accenture, BCG, Deloitte, Infosys, and PwC are deploying Claude-integrated security work for their clients. Anthropic also opened a Cyber Verification Program for organisations doing high-risk cybersecurity work who need access to safeguarded capabilities.

Last week added the governance half. The Compliance API, announced on May 21, exposes Claude Enterprise and Platform activity (prompts, responses, uploaded files, logs, and admin actions) to external security and governance tools. That is the unglamorous piece a security team needs before it will let any model near production code, because without an audit surface the model is a black box the compliance officer cannot sign off on.

The partner news matters even if you never touch the Enterprise product directly. When Wiz or CrowdStrike wires Opus 4.7 into a scanner you already run, the model’s reasoning reaches your pipeline through a tool you have already paid for and trust. That is the quieter distribution story. Not everyone signs up for Claude Security, but a lot of teams will end up running it without ever leaving the dashboard they know.

Read together, this is Anthropic positioning Claude as a layer that security vendors build on, not just a standalone scanner racing the incumbents. It rhymes with Anthropic’s wider cybersecurity bet, where the model is the engine and other companies ship the product on top of it.

What a Solo Studio Can Actually Use Today

Here is the honest part. Claude Security is an Enterprise public beta. Team and Max access is listed as coming soon, and there is no Pro tier in the announcement. As a one-person studio I cannot point it at my repositories yet, and I am not going to pretend otherwise.

So this is what I actually use in its place. Claude Code ships a built-in security review you run with the /security-review command, which makes an on-demand pass over a diff and flags issues before they land. There is also a Claude Code security action for GitHub that reviews pull requests automatically and leaves findings as inline comments on the PR. Both run for individual developers right now, both reason about code the same way the Enterprise product does, and both keep a human approving every patch.

My setup is small. The GitHub action runs on every pull request to main and comments anything it finds, so review happens before merge without me remembering to trigger it. When I am touching auth, payments, or anything that handles a token, I run the review command locally first and read the reasoning, not just the verdict. It catches the boring but dangerous things: a secret about to be committed, an unescaped query, a missing check on a webhook signature. Last month it stopped me from shipping a webhook endpoint that trusted its payload without verifying the signature header, the kind of mistake that reads as fine in review and bites in production. The reasoning, not just the flag, is what made me fix it properly instead of papering over it.
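For reference, the missing check looks roughly like this in Node. It is a minimal sketch, assuming the sender signs the raw request body with HMAC-SHA256 and sends the digest as hex in a header; the exact header name, encoding, and signing scheme vary by provider, so treat those as assumptions and check the provider's documentation.

```javascript
const crypto = require("node:crypto");

// Returns true only when the signature matches an HMAC-SHA256 of the raw body.
// Hex encoding and a shared secret are assumptions about the provider's scheme.
function verifyWebhookSignature(rawBody, signatureHex, secret) {
  if (typeof signatureHex !== "string") return false;
  const expected = crypto.createHmac("sha256", secret).update(rawBody).digest();
  const received = Buffer.from(signatureHex, "hex");
  // timingSafeEqual throws on a length mismatch, so compare lengths first.
  if (received.length !== expected.length) return false;
  return crypto.timingSafeEqual(received, expected);
}

const secret = "example-secret";
const body = JSON.stringify({ event: "order.created", id: 1 });
const goodSig = crypto.createHmac("sha256", secret).update(body).digest("hex");

console.log(verifyWebhookSignature(body, goodSig, secret)); // true
console.log(verifyWebhookSignature(body + " ", goodSig, secret)); // false
```

Two details are easy to get wrong: the HMAC must be computed over the raw bytes received, not a re-serialised object, and the comparison should be constant-time rather than a plain string equality.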

It is worth being clear about what this does not replace. It reasons about your own code, so it complements rather than supplants dependency scanning for known advisories, secret rotation, and the rest of a real security posture. Treat it as a very good reviewer, not a finished program. It is not the scheduled, webhook-routed, dismissal-tracking product either. It is the same instinct at solo scale, and the habits carry straight over if Team access lands the way it is promised.

Bottom Line

Claude Security went from a demo to a workflow in about a month. The scanner was never the hard part. Scheduled runs, dismissals that stick, and webhooks that file the ticket are what make a security tool something you keep instead of something you screenshot once. The model underneath is now shared by six security platforms and a governance API, which says more about the strategy than any single feature does.

For now it sits behind the Enterprise tier, so solo builders get the same engine through Claude Code review instead. Wire the GitHub action into your pull requests, run the review command before you touch anything sensitive, and watch the Team and Max rollout. Read the rest of the Claude coverage in the Lab while you wait.


TECH & AI

The Pareto Principle applied to software engineering

jackminion May 14, 2026 0

How to create an automated web browser:

🔹 Study Electron.js
🔹 Study React.js
🔹 Make the two play nice together
🔹 Abandon react-redux and write your own
🔹 Create an SDK, and make it self-documenting
🔹 Make sure all events are native (isTrusted flag is true)
🔹 Create an IDE from scratch
🔹 Make the SDK and IDE play nice together
🔹 Create a modules system
🔹 Create a public API for the modules
🔹 Create an I/O system
🔹 Create a command-line interface

And even though you obsessed over UX all this time, the app still feels uninviting, for lack of a better word.

👉 Then one weekend, add the ability to record automations with point-and-click. 🪄 Boom! Magic.

This is some version of the Pareto Principle in action. Even though a minority of actions produce the majority of the consequences, you really can’t skip the work.

I could not have added the recording feature without first building everything else.

And moving forward, this pattern will repeat. Most of the effort that goes into uindow will go unnoticed. But from time to time, users will say “wow” to magic that can only happen because of the invisible work.

Uindow is a free, source-available automated web browser. You can check it out on GitHub.


TECH & AI

The 800ms Barrier: Architecting Interruptible Voice Agents (Lessons from Sarvam AI x Swiggy)

jackminion May 8, 2026 0

The Signal: The 800ms Latency Barrier

In a research lab, a 3-second delay is an “optimization ticket.” In a live call with a hungry customer on the Swiggy app, 3 seconds is a churn event.

The partnership between Sarvam AI and Swiggy represents a shift in the “Boss Level” of agentic AI. Most developers build voice agents using a Cascaded Pipeline: STT -> LLM -> TTS. The result? A cumulative lag that makes the agent feel like a slow walkie-talkie. To build for the next billion users, you have to architect for Native Audio Streaming and sub-second response times.
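The cumulative-lag point is plain arithmetic: in a cascaded pipeline each stage waits for the previous one, so the latencies add. A quick sketch makes the budget visible. The per-stage numbers are illustrative placeholders, not measurements from Sarvam AI or Swiggy.

```javascript
// Illustrative per-stage latencies in milliseconds; placeholders, not measurements.
const cascaded = { stt: 400, llm: 700, tts: 350, network: 150 };
const BUDGET_MS = 800;

// Stages run in sequence, so the user-perceived delay is the sum.
function totalLatency(stages) {
  return Object.values(stages).reduce((sum, ms) => sum + ms, 0);
}

const total = totalLatency(cascaded);
console.log(`cascaded: ${total} ms, over budget by ${total - BUDGET_MS} ms`);
// cascaded: 1600 ms, over budget by 800 ms
```

With numbers in this range, shaving any single stage does not get you under 800ms; the stages have to overlap, which is the case for streaming.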

Phase 1: The Architectural Bet

We are moving from Request-Response to Streaming State Machines.

The Vendor Trap is relying on general-purpose, text-centric models for a multilingual, audio-first market. If you have to translate “Hinglish” to English just to understand an order, you’ve already lost the latency battle.

The Ownership Path is the Indic-Native Stack. Using Sarvam’s natively trained audio models allows us to process speech-to-intent directly. More importantly, we must implement a Bi-Directional WebSocket architecture. This allows the agent to “listen” while it “speaks”—the only way to handle the most difficult part of human conversation: The Barge-in.

Phase 2: Implementation (The Interruptible Voice Handler)

In a high-stakes environment like Swiggy, the agent must be able to stop mid-sentence and roll back its logic if the user changes their mind.

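// High-level logic for an interruptible voice kernel.
// processAudio() on the connection and the helper methods at the bottom are
// placeholders for the real streaming client, audio player, and LLM client.
class VoiceAgentKernel {
  constructor(wsConnection) {
    this.ws = wsConnection;
    this.isSpeaking = false;
    this.transactionLock = null; // Pending tool call; ensures tool-use safety
  }

  // Detect the "barge-in" (interruption)
  onUserSpeechDetected() {
    if (this.isSpeaking) {
      console.warn("SIGNAL: Interruption detected. Executing State Rollback.");
      this.killAudioPlayback();
      this.abortCurrentLLMGeneration();
      this.clearPendingTransactions();
      this.isSpeaking = false; // Playback has stopped, so reset the flag
    }
  }

  async handleAudioStream(chunk) {
    // Stream raw audio to Sarvam's native Indic pipeline
    const response = await this.ws.processAudio(chunk);

    if (response.intent_confidence > 0.9) {
      // Pre-warm tools before the user even stops talking
      this.prepareOrderTransaction(response.entities);
    }
  }

  clearPendingTransactions() {
    // Essential: prevents the "Ghost Order" bug
    if (this.transactionLock) {
      this.transactionLock.cancel();
      this.transactionLock = null;
    }
  }

  // Placeholder: stage the order without committing it to the backend.
  prepareOrderTransaction(entities) {
    this.clearPendingTransactions();
    this.transactionLock = {
      entities,
      cancelled: false,
      cancel() { this.cancelled = true; },
    };
  }

  // Placeholders: wire these to the audio player and the LLM client.
  killAudioPlayback() {}
  abortCurrentLLMGeneration() {}
}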


Phase 3: The Senior Security & Testing Audit

I put this Swiggy-scale blueprint through a professional Senior QA & Security Audit. Here is why your “standard” voice agent will fail in the wild.

The “Ghost Order” Race Condition (Logic Fault)
The Fault: The agent says “Ordering your Paneer Tikka…” The user interrupts: “No, wait! Make it a Chicken Roll!”
The Audit: In naive implementations, the “Order Tool” is triggered the moment the LLM starts talking. If the user interrupts, the audio stops, but the backend API has already committed the Paneer Tikka. You now have a frustrated customer and a wasted order.
The Fix: Implement Deferred Commits. The tool-call must remain in a PENDING state until the audio playback reaches a “Commit Threshold” (e.g., 90% completion) or receives a final verbal confirmation.

The “Ambient Audio Injection” (Security Breach)
The Fault: The user is ordering food while walking past a loud TV. The TV says “Cancel all orders.”
The Audit: Without Speaker Diarization, the agent cannot distinguish between the primary user and background noise. A malicious or accidental “audio injection” can trigger unauthorized actions.
The Fix: Use Sarvam’s front-end audio processing to enforce Voice Activity Detection (VAD) with a noise-floor gate. If the audio signal doesn’t match the primary speaker’s decibel profile or spatial characteristics, the kernel must ignore the intent.

The “Colloquial Logic Bypass” (Semantic Security)
The Fault: Your security prompts are in English, but the user is speaking a dialect-heavy mix of Hindi and regional slang.
The Audit: Traditional English-centric guardrails often miss the nuance of regional insults or “Hinglish” social engineering attempts used to trick the agent into granting a 100% discount.
The Fix: Security filters must be Indic-Native. By using Sarvam’s regional guardrails, we ensure that semantic boundaries are enforced at the phoneme level, not just the translation level.
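The Deferred Commit fix from the “Ghost Order” item can be sketched as a small state machine. This is an illustration under stated assumptions, not Swiggy's or Sarvam's implementation: the class name, the callback, and the 0.9 threshold are all hypothetical.

```javascript
// A pending tool call that only reaches the backend once playback passes a
// threshold or the user confirms. All names here are hypothetical.
class DeferredOrder {
  constructor(payload, commitFn, commitThreshold = 0.9) {
    this.payload = payload;
    this.commitFn = commitFn;
    this.commitThreshold = commitThreshold;
    this.state = "PENDING";
  }

  // Called as audio playback advances; progress is a fraction from 0 to 1.
  onPlaybackProgress(progress) {
    if (progress >= this.commitThreshold) this.commit();
  }

  // Called on an explicit verbal confirmation.
  onUserConfirmed() {
    this.commit();
  }

  // Called on barge-in; a no-op once the order has been committed.
  cancel() {
    if (this.state === "PENDING") this.state = "CANCELLED";
  }

  commit() {
    if (this.state !== "PENDING") return;
    this.state = "COMMITTED";
    this.commitFn(this.payload);
  }
}

const sent = [];
const paneer = new DeferredOrder({ item: "Paneer Tikka" }, (o) => sent.push(o));
paneer.onPlaybackProgress(0.4); // agent is still mid-sentence
paneer.cancel();                // user barges in
paneer.onPlaybackProgress(1.0); // a late progress event must not commit

const roll = new DeferredOrder({ item: "Chicken Roll" }, (o) => sent.push(o));
roll.onUserConfirmed();

console.log(sent); // [ { item: 'Chicken Roll' } ]
```

The state check inside commit() is what closes the race: once an order is cancelled, no late playback event can push it through.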

Phase 4: Checklist (The Architect’s Standard)

( ) Native Audio or Bust: If you are still converting audio to text before processing intent, your latency will never hit the 800ms gold standard.

( ) Transactional Barge-in: Verify that every interruption triggers a State Rollback for any pending API calls.

( ) Acoustic Hardening: Test your agent against 60dB of background “street noise” to ensure VAD stability.

( ) Regional Edge-Cases: Audit your “Hinglish” logic. Does your agent understand the difference between a user “asking for a discount” and a user “threatening to cancel”?
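For the Acoustic Hardening item, the simplest form of a noise-floor gate is an energy check on each audio frame. The sketch below measures the level in dBFS (decibels relative to digital full scale), which is not the same unit as the 60dB street-noise figure; mapping between the two depends on microphone calibration. The -35 dBFS default is a placeholder to tune, and an energy gate on its own does not identify who is speaking.

```javascript
// Root-mean-square level of a frame of samples in [-1, 1], in dBFS.
function rmsDbfs(samples) {
  let sumSquares = 0;
  for (const s of samples) sumSquares += s * s;
  const rms = Math.sqrt(sumSquares / samples.length);
  return rms === 0 ? -Infinity : 20 * Math.log10(rms);
}

// Pass a frame on to intent detection only when it clears the noise floor.
// The -35 dBFS default is a placeholder to tune against real recordings.
function passesNoiseGate(samples, floorDbfs = -35) {
  return rmsDbfs(samples) > floorDbfs;
}

// Synthetic 20 ms frames: a 220 Hz tone at a given amplitude, sampled at 16 kHz.
const tone = (amplitude) =>
  Float32Array.from(
    { length: 320 },
    (_, i) => amplitude * Math.sin((2 * Math.PI * 220 * i) / 16000)
  );

console.log(passesNoiseGate(tone(0.5)));   // true
console.log(passesNoiseGate(tone(0.005))); // false
```

A test harness can replay recorded street noise through the same function and confirm that quiet background frames stay below the gate while the primary speaker clears it.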

The Bottom Line: Building for the next billion users requires an infrastructure that respects the speed of human thought. Sarvam AI provides the native Indic engine; your job is to build the Deterministic House that keeps the order safe.

