Pinecone has released Dedicated Read Nodes (DRN) in public preview to enable engineering teams to run high-scale, latency-sensitive vector database workloads with consistent performance and predictable costs.

Pinecone CEO Ash Ashutosh explains, "Dedicated Read Nodes is a breakthrough for organisations striving to make their AI applications knowledgeable at scale. By combining this new capability with our On-Demand service, Pinecone now ensures customers achieve the optimal cost-performance for any workload—from bursty RAG and agents to massive, multi-billion vector, high-throughput AI and agentic applications."
Dedicated Read Nodes
DRN works with Pinecone's serverless slab architecture to enable companies to run multiple AI applications, each with different requirements for throughput, latency, concurrency, and cost.
The architecture supports both bursty and steady workloads within a single system, helping engineering teams handle diverse AI tasks without reconfiguration or extra effort.
Teams can select the capacity mode that fits each use case: Pinecone supports workloads from 100 million to over a billion vectors with stable 20 to 100 millisecond latencies and thousands of sustained queries per second (QPS).
Benefits of DRN
Pinecone claims DRN offers engineering teams the following:
- Lower, more predictable cost: Dedicated capacity makes expenses easier to forecast, keeping high-QPS workloads manageable and budget-friendly.
- Predictable low-latency and high throughput: Isolated read nodes and a warm data path (memory + local SSD) deliver consistent performance under heavy load.
- Scale for your largest workloads: Built for billion-vector semantic search and high-QPS recommendation systems. Add replicas to scale throughput; add shards to grow storage.
- No migrations required: Pinecone handles data movement and scaling behind the scenes.
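The scaling model in the bullets above (replicas for throughput, shards for storage) lends itself to simple back-of-envelope capacity planning. The sketch below illustrates the idea; the per-replica QPS and per-shard vector figures are illustrative assumptions for this example, not published Pinecone sizing guidance.

```python
import math

def plan_capacity(target_qps, total_vectors,
                  qps_per_replica=1_000,          # assumed sustained QPS one replica can serve
                  vectors_per_shard=200_000_000):  # assumed vector capacity of one shard
    """Rough sizing sketch: replicas scale throughput, shards grow storage."""
    replicas = math.ceil(target_qps / qps_per_replica)
    shards = math.ceil(total_vectors / vectors_per_shard)
    return replicas, shards

# e.g. 5,000 sustained QPS against a 1-billion-vector index
replicas, shards = plan_capacity(5_000, 1_000_000_000)
print(replicas, shards)  # -> 5 5
```

Because Pinecone handles data movement behind the scenes, adjusting either dimension would not require a manual migration.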
