What's the latency budget for robot memory queries?
5–50ms for the hot path (reflexes stay local), 50–500ms for planning, 1–5s for strategic retrieval. bRRAIn's tiered memory (local buffer → master context → graph RAG) matches one tier to each budget.
Three tiers, three budgets
Robot memory has to answer three very different questions at three very different speeds. Reflex loops demand 5 to 50 milliseconds — the time between seeing an obstacle and refusing to step into it. Planning cycles can tolerate 50 to 500 milliseconds — enough to look up a route or a policy. Strategic retrieval — "what is our current policy on this type of intervention?" — lives in the 1 to 5 second range. A single memory store cannot serve all three. bRRAIn stacks three tiers to match each budget.
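To make the tier-to-budget mapping concrete, here is a minimal sketch of a budget table and a router that picks the deepest tier whose worst case still meets a caller's deadline. The class and function names (`Tier`, `tier_for_deadline`) are illustrative assumptions, not a real bRRAIn API.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Tier:
    name: str
    min_ms: float  # lower edge of the latency budget
    max_ms: float  # upper edge of the latency budget

# The three budgets from the article, in order of depth.
TIERS = [
    Tier("local buffer (reflex)", 5, 50),
    Tier("master context (planning)", 50, 500),
    Tier("graph RAG (strategic)", 1_000, 5_000),
]

def tier_for_deadline(deadline_ms: float) -> Tier:
    """Pick the deepest tier whose worst case still fits the deadline."""
    eligible = [t for t in TIERS if t.max_ms <= deadline_ms]
    if not eligible:
        # Nothing deeper fits: the reflex tier is the only safe fallback.
        return TIERS[0]
    return eligible[-1]

print(tier_for_deadline(40).name)    # only the reflex tier fits
print(tier_for_deadline(500).name)   # planning fits, strategic does not
print(tier_for_deadline(5000).name)  # all three fit; take the deepest
```

The key design point the sketch captures: a query never escalates to a tier whose worst-case latency could blow the caller's deadline.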
The hot path stays on the robot
Reflex decisions never leave the robot. They live in an on-device short-term buffer handled by the Embedded SDK runtime, close enough to actuators to meet the 5 to 50 millisecond window. Anything more ambitious — asking the graph for historical context, checking a policy update — belongs in a slower tier. Keeping reflexes local is how you prevent a network blip from becoming a safety incident. The SDK hides the complexity: the robot simply calls a local API and gets an answer inside the hot-path budget.
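A reflex-tier read has to be an in-RAM lookup with no network round trip. The sketch below shows one plausible shape for such a buffer, assuming the Embedded SDK exposes something like a bounded local store; the names (`ReflexBuffer`, `observe`, `latest`) are hypothetical, not the real SDK surface.

```python
import time
from collections import deque

class ReflexBuffer:
    """Bounded on-device buffer: old readings fall off the back."""

    def __init__(self, maxlen: int = 256):
        self._events = deque(maxlen=maxlen)

    def observe(self, key: str, value) -> None:
        # Sensors append here on every cycle.
        self._events.append((time.monotonic(), key, value))

    def latest(self, key: str):
        """Scan a small in-RAM deque, newest first. No network involved,
        so a read comfortably fits the 5-50 ms reflex window."""
        for _ts, k, v in reversed(self._events):
            if k == key:
                return v
        return None

buf = ReflexBuffer()
buf.observe("obstacle_ahead", False)
buf.observe("obstacle_ahead", True)
print(buf.latest("obstacle_ahead"))  # the most recent reading wins
```

Because the buffer is bounded and local, a dropped network link degrades nothing on this path: the reflex loop keeps reading the freshest sensor state it has.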
Master context serves planning
The middle tier is the consolidated master context file served by bRRAIn's Memory Engine / Handler. It is prebuilt by the Consolidator and delivered to the robot at boot or refresh, then cached in RAM for fast reads. Planning cycles hit this cache in the 50 to 500 millisecond window — well within typical robot planning intervals. The master context includes the environment model, the robot's role-scoped tool set, and recent fleet events, all assembled so the robot rarely needs to leave its cache to plan a next move.
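The planning tier can be thought of as a read-through cache: a slow refresh at boot, then pure dict lookups in RAM. The following sketch assumes that shape; `MasterContextCache` and the fake Consolidator build are illustrative stand-ins, not bRRAIn's actual interface.

```python
import time

class MasterContextCache:
    def __init__(self, fetch):
        # `fetch` stands in for however the Consolidator's latest
        # build is delivered to the robot.
        self._fetch = fetch
        self._context = None
        self._loaded_at = None

    def refresh(self) -> None:
        """Boot-time or refresh-signal path: the only slow read."""
        self._context = self._fetch()
        self._loaded_at = time.monotonic()

    def get(self, key: str, default=None):
        """Planning-cycle read: a RAM dict lookup, no network,
        well inside the 50-500 ms window."""
        if self._context is None:
            self.refresh()
        return self._context.get(key, default)

def fake_consolidator_build():
    # Hypothetical payload shaped like the article's description:
    # environment model, role-scoped tools, recent fleet events.
    return {
        "environment": "warehouse-3",
        "role_tools": ["lift", "scan"],
        "recent_events": [],
    }

cache = MasterContextCache(fake_consolidator_build)
cache.refresh()
print(cache.get("environment"))
```

The split matters: refresh cost is paid rarely and off the planning path, so every planning read stays a cache hit.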
The graph handles strategic retrieval
When a robot needs to answer a deeper question — why a policy exists, what happened last time this situation arose, which Sovereign made the last call — it queries the POPE Graph RAG layer. These retrievals live in the 1 to 5 second window because the graph is large and the query is rich. A robot rarely needs this tier inside a time-critical loop, but when it does, the graph delivers evidenced answers with provenance. All three tiers together form the complete memory stack for real-time robotics.
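What distinguishes this tier is that answers carry their evidence. A minimal sketch of that contract, with made-up shapes (`Evidence`, `GraphAnswer`, `query_graph`) and a stubbed result standing in for a real 1–5 s graph traversal; none of this is the actual POPE Graph RAG interface.

```python
from dataclasses import dataclass, field

@dataclass
class Evidence:
    source: str      # e.g. which Sovereign decision or fleet event
    timestamp: str

@dataclass
class GraphAnswer:
    text: str
    evidence: list[Evidence] = field(default_factory=list)

def query_graph(question: str) -> GraphAnswer:
    # Stand-in for the real graph retrieval; a production call
    # would traverse the graph and attach each supporting record.
    return GraphAnswer(
        text="(stub) retrieved policy rationale",
        evidence=[Evidence("sovereign-decision/example", "2025-03-02T14:05Z")],
    )

ans = query_graph("why does this policy exist?")
# The contract: no answer without at least one provenance record.
assert ans.evidence, "strategic answers must cite their sources"
print(ans.evidence[0].source)
```

The assertion encodes the guarantee the prose describes: the strategic tier is slow precisely because it returns evidenced answers, not bare strings.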
Relevant bRRAIn products and services
- Memory Engine / Handler — serves the master context at planning-budget speed.
- Consolidator — keeps the master context current so planning reads stay cached.
- POPE Graph RAG — strategic retrieval with full provenance in the 1 to 5 second tier.
- Embedded SDK — keeps the reflex-tier memory local and fast enough for actuator loops.
- SDK quickstart — wires the three tiers into a single robot stack.
- Architecture overview — the tiered memory design at a glance.