Scaling Custom AI Knowledge Apps at BlackRock
Author: Ptrck Brgr
BlackRock’s investment operations teams face a unique challenge: they must build and deploy complex knowledge workflows fast, while satisfying strict compliance requirements. Traditional development approaches—spanning months of engineering—couldn’t keep pace with market and operational demands.
To solve this, BlackRock created a modular, Kubernetes-native AI framework that lets domain experts directly shape and deploy large language model (LLM)-powered apps. By splitting the workflow into a low-code “sandbox” for iteration and an automated “app factory” for deployment, they reduced delivery times from 3–8 months to just a few days.
Main Story
Investment operations are the unseen machinery behind portfolio decisions. They involve data acquisition, compliance, and post-trade processes—each with its own domain-specific complexity. BlackRock identified four AI application domains where LLMs could add value: document extraction, workflow automation, Q&A/chat, and agentic systems.
One example is “new issue operations”: setting up securities for IPOs, stock splits, or other events. This requires parsing lengthy prospectuses or term sheets, extracting structured data, validating it, and integrating it into downstream systems. Historically, each such app could take months to build, with multiple handoffs between business and engineering teams.
The new architecture changes that. The sandbox is a UI-driven environment where domain experts—without coding—can define extraction templates, set field-level rules, manage documents, and experiment with prompt strategies like retrieval-augmented generation (RAG) or chain-of-thought reasoning.
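To make the idea concrete, here is a minimal sketch of what such an extraction template could look like, assuming a Python representation; the field names, rule syntax, and strategy labels are illustrative assumptions rather than BlackRock's actual sandbox schema.

```python
from dataclasses import dataclass, field

# Illustrative only: field names, rule syntax, and strategy labels are
# assumptions, not BlackRock's actual sandbox schema.

@dataclass
class FieldSpec:
    name: str                      # e.g. "coupon_rate"
    prompt_hint: str               # phrasing the LLM should look for in the document
    rules: list[str] = field(default_factory=list)   # field-level validation rules

@dataclass
class ExtractionTemplate:
    document_type: str             # e.g. "term_sheet", "prospectus"
    strategy: str                  # "rag", "in_context", or "chain_of_thought"
    fields: list[FieldSpec] = field(default_factory=list)

new_issue_template = ExtractionTemplate(
    document_type="term_sheet",
    strategy="rag",
    fields=[
        FieldSpec("issuer", "legal name of the issuing entity", ["non_empty"]),
        FieldSpec("coupon_rate", "annual coupon as a percentage", ["numeric", "range:0-25"]),
        FieldSpec("maturity_date", "final maturity date", ["iso_date", "after:issue_date"]),
    ],
)
```

A template like this is something a domain expert can assemble through the sandbox UI; the point is that the extraction logic lives in configuration, not in bespoke application code.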
"If you can get that sandbox out into the hands of the domain experts then your iteration speed becomes really fast."
Once the extraction logic is finalized, the app factory—a cloud-native operator—automatically packages it into a production-ready app. This abstracts away infrastructure details, choosing the right compute profile (GPU vs. burstable) and managing deployment, access control, and scaling.
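As a rough illustration of the routing decision involved, the sketch below maps a few workload characteristics to a compute profile; the thresholds and profile names are assumptions for illustration, not the app factory's actual logic.

```python
# Illustrative sketch: thresholds and profile names are assumptions,
# not the app factory's real decision logic.

def choose_compute_profile(uses_local_model: bool,
                           avg_pages_per_doc: int,
                           docs_per_day: int) -> str:
    """Map coarse workload characteristics to a deployment compute profile."""
    if uses_local_model:
        return "gpu"            # in-cluster model inference needs accelerators
    if avg_pages_per_doc > 200 or docs_per_day > 1_000:
        return "cpu-dedicated"  # heavy parsing and embedding benefit from reserved CPU
    return "burstable"          # light, intermittent workloads can share capacity


profile = choose_compute_profile(uses_local_model=False,
                                 avg_pages_per_doc=80,
                                 docs_per_day=50)
print(profile)  # -> "burstable"
```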
By integrating transformation and execution into the same environment, the system removes brittle CSV/JSON handoffs. The result is faster iteration, fewer errors, and smoother compliance checks.
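A minimal sketch of that idea, using hypothetical stand-ins for the extraction call, schema mapping, and downstream client: the transformation runs inside the pipeline rather than on an exported file.

```python
# Illustrative only: the extraction step, schema mapping, and downstream
# sink are hypothetical stand-ins, not real BlackRock APIs.

def extract_fields(document: str, template) -> dict:
    """Placeholder for the LLM-backed extraction step."""
    return {"issuer": "Example Corp", "coupon_rate": "4.25%"}

def to_downstream_schema(raw: dict) -> dict:
    """Transformation lives in the pipeline, not in a hand-edited CSV/JSON export."""
    return {
        "issuer_name": raw["issuer"],
        "coupon_rate_bps": int(float(raw["coupon_rate"].rstrip("%")) * 100),
    }

def run_new_issue_pipeline(document: str, template, downstream: list) -> dict:
    record = to_downstream_schema(extract_fields(document, template))
    downstream.append(record)   # stand-in for pushing into the target system
    return record

sink: list[dict] = []
run_new_issue_pipeline("…term sheet text…", template=None, downstream=sink)
print(sink)  # -> [{'issuer_name': 'Example Corp', 'coupon_rate_bps': 425}]
```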
Technical Considerations
For engineering leaders, the sandbox and app factory model surfaces several practical lessons:
- Prompt engineering at scale is a first-class problem. Financial documents are long and complex; prompts need version control, evaluation metrics, and collaborative editing
- LLM strategy selection matters. Factors like document size, complexity, and compliance needs determine whether to use RAG, in-context learning, or hybrid approaches (see the sketch after this list)
- Deployment automation must account for performance and cost. Matching workloads to compute profiles avoids over-provisioning while keeping latency within acceptable bounds
- Human-in-the-loop design is non-negotiable in regulated settings. The architecture supports validation checkpoints before outputs affect downstream systems
- Integration paths should eliminate manual post-processing. Embedding transformation logic into the same environment reduces operational friction
- Security and access control are built into the deployment layer, ensuring only authorized users can trigger sensitive workflows
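As referenced above, the heuristic below sketches how strategy selection might be encoded; the thresholds and strategy names are assumptions for illustration, not a documented BlackRock decision rule.

```python
# Illustrative heuristic only: thresholds and strategy names are assumptions,
# not a documented decision rule.

def select_llm_strategy(doc_tokens: int,
                        fields_per_doc: int,
                        needs_audit_trail: bool) -> str:
    """Pick an extraction strategy from coarse document and compliance signals."""
    if doc_tokens <= 8_000 and fields_per_doc <= 10:
        return "in_context"          # whole document fits comfortably in the prompt
    if needs_audit_trail:
        return "rag_with_citations"  # retrieved passages double as evidence for reviewers
    return "rag"                     # retrieve only the relevant sections of long documents


print(select_llm_strategy(doc_tokens=120_000, fields_per_doc=35, needs_audit_trail=True))
# -> "rag_with_citations"
```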
This approach requires strong cross-functional capability: prompt engineering expertise among domain SMEs, infrastructure automation from platform teams, and governance frameworks owned by compliance officers.
Business Impact & Strategy
The shift from months-long to days-long delivery cycles changes the economics of AI in operations. Leaders can now:
- Accelerate time-to-value for internal tooling
- Reduce dependency on scarce engineering resources
- Standardize AI app development across multiple domains
- Improve ROI by reusing modular components across projects
For example, a process that once took up to 8 months now ships in a couple of days. This frees engineering capacity for higher-value work and allows operations teams to respond to market events in near real-time.
Risks include over-investing in custom tooling when off-the-shelf solutions suffice, and underestimating the training needed for domain experts to become effective prompt engineers. BlackRock mitigates these risks by assessing ROI before productionizing and by embedding education into the rollout.
Key Insights
- Empowering domain experts with low-code AI tools accelerates iteration and reduces engineering bottlenecks
- Clear frameworks for LLM strategy selection improve performance and compliance alignment
- Integrating transformation into extraction workflows eliminates brittle, manual post-processing
- Automated deployment profiles optimize cost-performance trade-offs
- Human-in-the-loop validation is essential in regulated industries
Why It Matters
For technical and business leaders, this case study shows that AI adoption in complex, regulated environments is not just about model choice—it’s about the surrounding architecture. Giving subject matter experts the ability to design and iterate on AI-powered workflows, while automating infrastructure concerns, creates a force multiplier for both speed and governance.
The sandbox–app factory pattern is transferable to other industries where document-heavy, compliance-bound processes are common. It aligns technical flexibility with business agility, without sacrificing control.
Conclusion
BlackRock’s approach demonstrates that scaling AI in a regulated enterprise is as much about process and platform as it is about the models themselves. By pairing domain-expert sandboxes with automated deployment, they’ve compressed delivery cycles, reduced friction, and maintained compliance.
For leaders exploring similar transformations, the key takeaway is to design for human-in-the-loop first, standardize your LLM strategies, and embed transformation into the core workflow. Watch the full discussion here: https://www.youtube.com/watch?v=08mH36_NVos