Qwen3.6-35B-A3B brings local agentic coding to everyone
For engineers, designers & product people. Stay up to date with a free daily digest.
TLDR: New Qwen local coding model, Anthropic’s Opus 4.7, and OpenAI’s Codex refresh all push harder on reliable, agentic workflows, while AWS leans into formal verification.
Qwen3.6-35B-A3B opens high-end agentic coding to local setups
Alibaba’s Qwen team released Qwen3.6-35B-A3B, a 35B-parameter coding model optimized for agentic workflows and now generally available for local and cloud deployment as of 2026-04-17. The Hacker News launch post highlights strong reasoning and tool-use capabilities plus competitive coding performance without needing frontier-scale hardware.
For agent builders, the draw is a model explicitly tuned for multi-step coding, planning, and tool orchestration that you can realistically run on a high-end workstation or modest server. That puts “autonomous” repo refactors, code review assistants, and CI-integrated agents within reach without paying frontier model rates. It is early: there is limited independent benchmarking and you will still need careful sandboxing for write-access flows.
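One way to make write-access flows safer, whichever local model you run, is to confine the agent's file-writing tool to a single allow-listed root so a confused or prompt-injected agent cannot touch the rest of the filesystem. A minimal Python sketch; the `write_file` tool interface here is hypothetical, not part of any Qwen release:

```python
from pathlib import Path

class SandboxedWriter:
    """Restrict an agent's file-write tool to one allow-listed root.

    Illustrative sketch only: the tool interface is an assumption,
    not an API shipped with Qwen3.6-35B-A3B.
    """

    def __init__(self, root: str):
        self.root = Path(root).resolve()

    def write_file(self, rel_path: str, content: str) -> str:
        target = (self.root / rel_path).resolve()
        # Reject paths that escape the sandbox root, e.g. "../../etc/passwd".
        if not target.is_relative_to(self.root):
            raise PermissionError(f"write outside sandbox: {rel_path}")
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_text(content)
        return f"wrote {len(content)} bytes to {target}"
```

Resolving the path before the check matters: it is what defeats `..` traversal and symlink-style escapes that a naive string-prefix comparison would miss.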
Worth noting: Simon Willison reports Qwen3.6-35B-A3B running on his laptop outperforming Claude Opus 4.7 on his informal “pelican on a bicycle” visual benchmark, which at least suggests solid multimodal behavior.
Anthropic’s Claude Opus 4.7 targets advanced software engineering agents
Anthropic introduced Claude Opus 4.7, claiming a 14 percent improvement on complex multi-step workflows over Opus 4.6 with fewer tokens and roughly one-third the tool errors, as of 2026-04-17. Enterprise users like Box report 56 percent fewer model calls, 50 percent fewer tool calls, 24 percent faster responses, and 30 percent lower AI unit usage versus Opus 4.6.
The focus is disciplined, controllable autonomy: Opus 4.7 adds better planning, more reliable tool calling, and improved recovery from tool failures so orchestrator agents can keep going when APIs misbehave. This matters if you run production agents inside products like Notion, Replit, or your own internal dev tools, where silent tool-call flakiness is a real cost. Anthropic also says it is the first model to pass its implicit-need tests, which should reduce cases where the model “helpfully” does the wrong thing.
You will still need sandboxing and guardrails for write-heavy agents, but Opus 4.7 looks like a pragmatic upgrade path for existing Claude pipelines, not a full re-architecture.
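The recovery pattern can also be approximated at the orchestrator level, whichever model sits underneath. A hedged sketch of retry-with-backoff plus structured error reporting, assuming a generic `tool(**args)` callable rather than any specific Anthropic API:

```python
import time

def call_tool_with_recovery(tool, args, retries=3, base_delay=0.5):
    """Run a tool call with retry-and-report recovery.

    On transient failure, retry with exponential backoff; on final
    failure, return a structured error the orchestrator can hand back
    to the model instead of crashing the run. The tool signature is
    an assumption for illustration.
    """
    last_err = None
    for attempt in range(retries):
        try:
            return {"ok": True, "result": tool(**args)}
        except Exception as e:  # in production, catch narrower error types
            last_err = e
            if attempt < retries - 1:
                time.sleep(base_delay * 2 ** attempt)
    return {"ok": False, "error": f"{type(last_err).__name__}: {last_err}"}
```

Returning the failure as data rather than raising is the key design choice: it lets the model see what went wrong and pick an alternative tool or plan, which is exactly the behavior the Opus 4.7 release notes emphasize.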
Also covered by: VentureBeat, 9to5Mac
OpenAI expands Codex and launches GPT-Rosalind for life sciences
OpenAI updated the Codex desktop app for macOS and Windows to add computer use, in-app browsing, image generation, long-term memory, and plugins for developer workflows as of 2026-04-17. Codex now looks more like a general-purpose agent shell than a code-only assistant, with the ability to operate your IDE and browser to execute multi-step tasks.
In parallel, OpenAI launched GPT-Rosalind, a frontier reasoning model focused on drug discovery, genomics, protein reasoning, and broader life-science workflows. If your team builds agents for biotech or lab automation, Rosalind’s domain-specific priors could significantly reduce prompt engineering and post-hoc verification, although you will still need regulatory-grade validation.
The big picture for engineers: Codex becomes an on-device orchestrator that can chain tools, while Rosalind signals a deeper specialization trend for high-value verticals. Both push toward agents that not only suggest actions but execute workflows end to end.
Quick Hits
How Automated Reasoning checks in Amazon Bedrock transform generative AI compliance
Formal verification components in Amazon Bedrock aim to give mathematically provable guarantees over certain AI outputs for regulated industries, which could be useful if you build agents for healthcare, finance, or gov workloads.
Cost-efficient custom text-to-SQL using Amazon Nova Micro and Amazon Bedrock on-demand inference
Walks through fine-tuning Amazon Nova Micro for custom SQL dialects, targeting cheaper text-to-SQL agents that still hit production reliability constraints.
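One cheap reliability guardrail for any text-to-SQL agent, regardless of which model produces the query, is to compile the candidate SQL against the target schema before ever executing it. A sketch using SQLite's `EXPLAIN`; the model call that would generate `sql` is out of scope here, and the schema is illustrative:

```python
import sqlite3

def is_valid_sql(schema_ddl: str, sql: str) -> bool:
    """Guardrail for a text-to-SQL agent: compile the candidate query
    with EXPLAIN against the target schema before running it for real.

    EXPLAIN parses and plans the statement without executing it, so
    syntax errors and references to missing columns surface cheaply.
    """
    conn = sqlite3.connect(":memory:")
    try:
        conn.executescript(schema_ddl)
        conn.execute("EXPLAIN " + sql)  # compiles only; runs nothing
        return True
    except sqlite3.Error:
        return False
    finally:
        conn.close()
```

A failed check can be fed back to the model as a repair prompt, which is usually cheaper than letting a broken query reach the production database.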
Show HN: Agent-cache – Multi-tier LLM/tool/session caching for Valkey and Redis
Agent-cache adds a multi-tier exact-match cache for LLM responses, tool outputs, and session state on vanilla Valkey or Redis, with adapters for LangChain, LangGraph, and Vercel AI SDK to cut agent latency and cost.
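The core idea behind exact-match caching is simple enough to sketch without the library. A minimal version (not Agent-cache's actual API; a plain dict stands in for Valkey/Redis):

```python
import hashlib
import json

class ExactMatchCache:
    """Minimal illustration of exact-match LLM-response caching:
    key on a hash of (model, prompt, params), store the response,
    and skip the model call entirely on a hit.

    A dict stands in for Valkey/Redis; this is not Agent-cache's API.
    """

    def __init__(self):
        self.store = {}  # swap for a Redis/Valkey client in production
        self.hits = 0

    def key(self, model: str, prompt: str, **params) -> str:
        # Canonical JSON so identical requests always hash the same way.
        payload = json.dumps([model, prompt, params], sort_keys=True)
        return hashlib.sha256(payload.encode()).hexdigest()

    def get_or_call(self, llm, model: str, prompt: str, **params):
        k = self.key(model, prompt, **params)
        if k in self.store:
            self.hits += 1
            return self.store[k]
        result = llm(model, prompt, **params)  # only runs on a miss
        self.store[k] = result
        return result
```

Exact-match caching only pays off when agents repeat identical calls (retries, loops, shared sub-tasks), which is precisely the traffic pattern agent frameworks tend to generate.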
Launch HN: Kampala (YC W26) – Reverse-Engineer Apps into APIs
Kampala is a MITM proxy that records real user traffic across web, mobile, and desktop apps so you can expose stable pseudo-APIs for agents without brittle RPA or UI automation.
Transform retail with AWS generative AI services
AWS pitches generative AI tooling for virtual try-on and personalization, useful if you are wiring agents into retail discovery, sizing, or post-purchase support flows.
The PR you would have opened yourself
Hugging Face ships a Skill plus test harness to auto-port transformers models to Apple’s mlx-lm stack, lowering friction to run community models efficiently on Apple silicon.
Training and Finetuning Multimodal Embedding & Reranker Models with Sentence Transformers
Guide to training multimodal embeddings and rerankers for retrieval-augmented generation, semantic search, and reranking, with practical recipes for extending Sentence Transformers.
datasette 1.0a28
Alpha release of Datasette 1.0a28 fixes regressions from 1.0a27, relevant if you use Datasette as a lightweight backend or analytics surface behind internal agents.
Qwen3.6-35B-A3B on my laptop drew me a better pelican than Claude Opus 4.7
Anecdotal but fun comparison of Qwen3.6-35B-A3B versus Claude Opus 4.7 on a quirky multimodal task, plus hardware notes for local experimentation.