# System design challenges scored by the community
## Multi-tenant AI agent platform

**Tags:** ai-platform, enterprise, agent, +5 · **Created:** Mar 14, 2026

Design a scalable platform (like AWS/Vercel for AI agents) that enables any enterprise to self-service create, configure, deploy, and manage action-oriented, multimodal AI agents. The platform must support:

- Strict tenant isolation
- Bring-your-own API keys and integration with common LLM providers (OpenAI, Anthropic, Gemini, MiniMax, etc.)
- Automated billing
- Enterprise-grade security (SSO, RBAC, audit logs, compliance)
- Deep observability (real-time logs, metrics, traces, dashboards)
- Multi-tool, API, and MCP integration for data analysis, workflow automation, and custom actions
- Global, multi-region deployment, failover, and high availability
- Dynamic LLM routing based on task complexity, speed, and cost
- Low latency (< 2 s)
- Strict logical separation of tenant data, analytics, and credentials
- Self-service agent creation and instant deployment
- User-facing dashboard for managing agents, analytics, configurations, and billing
- Centralized admin dashboard for cross-tenant management, quota enforcement, and system health
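The "dynamic LLM routing" requirement above can be sketched as a cost-aware model selector. This is a minimal illustration, not a prescribed design: the model names, prices, and latencies in `CATALOG` are invented placeholders, and a real router would pull them from provider metadata and live telemetry.

```python
from dataclasses import dataclass

@dataclass
class Model:
    name: str
    cost_per_1k_tokens: float  # USD; illustrative numbers only
    p50_latency_ms: int
    quality_tier: int          # 1 = fastest/cheapest, 3 = strongest

# Hypothetical catalog standing in for real provider model metadata.
CATALOG = [
    Model("fast-small", 0.0002, 300, 1),
    Model("balanced", 0.003, 900, 2),
    Model("frontier", 0.03, 1800, 3),
]

def route(task_complexity: int, latency_budget_ms: int) -> Model:
    """Pick the cheapest model meeting both the quality tier and latency budget."""
    candidates = [m for m in CATALOG
                  if m.quality_tier >= task_complexity
                  and m.p50_latency_ms <= latency_budget_ms]
    if not candidates:
        # Degrade gracefully: serve with the fastest model rather than failing.
        return min(CATALOG, key=lambda m: m.p50_latency_ms)
    return min(candidates, key=lambda m: m.cost_per_1k_tokens)
```

Routing cheap-by-default while honoring a latency budget is one way to reconcile the "< 2 s" target with cost control; a production router would also weigh provider health and per-tenant quotas.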
## Enterprise search and retrieval engine (Glean-like)

**Tags:** vector-database, data-ingestion, security, +4 · **Created:** Feb 15, 2026

Design a search and retrieval engine that ingests data from multiple SaaS sources (Google Drive, Slack, Jira, Notion, Confluence, GitHub, Salesforce) and powers LLM-generated answers with strict permissioning, similar to Glean. The platform should support:

- Real-time indexing (document updates reflected in < 1 minute)
- Hybrid search (keyword BM25 + vector semantic search + learned re-ranking)
- Document-level and fragment-level Access Control Lists (ACLs) mirrored from source systems
- Citation tracking (linking every answer sentence back to its source fragment)
- Multi-tenant architecture with strict data isolation (no cross-tenant data leakage)
- Handling diverse data formats (PDF, HTML, DOCX, code, chat logs, spreadsheets, images with OCR)
- Connector framework: pluggable adapters for each SaaS source with incremental sync
- Chunking pipeline: intelligent document splitting (respecting headings, tables, code blocks)
- Query understanding: intent detection, entity extraction, query rewriting for better recall
- Personalization: rank results higher based on the user's team, role, and interaction history
- Admin console: per-tenant analytics, connector health, index coverage dashboards
- Feedback loop: thumbs-up/down on answers to fine-tune re-ranking and retrieval quality

Scale target: 5,000 enterprise customers, 1 PB of indexed text, 50K queries/sec peak, average 200 documents per query fan-out, p99 end-to-end answer latency < 3 seconds.
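Two of the requirements above — merging keyword and vector results, and enforcing source-mirrored ACLs — can be sketched in a few lines. This is one common approach (reciprocal rank fusion plus a post-retrieval permission filter), not the only valid design; the function names and the group-set ACL model are assumptions for illustration.

```python
def rrf_merge(keyword_ranked, vector_ranked, k=60):
    """Reciprocal Rank Fusion: combine two ranked lists of doc IDs.

    Each list contributes 1 / (k + rank + 1) to a document's score,
    so documents ranked highly in either list rise to the top.
    """
    scores = {}
    for ranked in (keyword_ranked, vector_ranked):
        for rank, doc_id in enumerate(ranked):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank + 1)
    return sorted(scores, key=scores.get, reverse=True)

def filter_by_acl(doc_ids, user_groups, doc_acls):
    """Drop documents whose ACL shares no group with the requesting user."""
    return [d for d in doc_ids if doc_acls.get(d, set()) & user_groups]
```

Filtering after fusion keeps ranking simple, but at the stated 200-document fan-out a real system would usually push ACL predicates down into the index query itself so permission checks never dominate the p99 latency budget.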
## ChatGPT-style conversational web application

**Tags:** web-application, real-time, streaming, +1 · **Created:** Feb 14, 2026

Design a web application similar to ChatGPT that allows users to have multi-turn conversations with large language models. The platform should support:

- User authentication and session management
- Multi-turn conversation threads with context retention
- Real-time streaming of LLM responses (token by token)
- Conversation history with search and organization
- Multiple model selection (different LLM backends)
- Rate limiting and usage quotas per user/tier
- Markdown rendering in responses (code blocks, tables, etc.)
- File upload and multimodal input (images, documents)
- Sharing conversations via public links
- Admin dashboard for monitoring usage and costs

Scale target: 20 million daily active users, 500 million messages per day, average conversation length of 10 turns.
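Two mechanics above lend themselves to a short sketch: token-by-token streaming (here framed as Server-Sent Events, one common transport for this) and context retention under a fixed token budget. The helper names and the word-count tokenizer in the test are illustrative assumptions, not a prescribed implementation.

```python
import json

def sse_events(token_stream):
    """Wrap an iterator of tokens as Server-Sent Events frames.

    Each frame is `data: <json>\n\n`; a terminal [DONE] frame tells the
    browser's EventSource handler to stop listening.
    """
    for token in token_stream:
        yield f"data: {json.dumps({'token': token})}\n\n"
    yield "data: [DONE]\n\n"

def trim_history(messages, max_tokens, count_tokens):
    """Keep the most recent turns that fit the model's context window.

    Walks the thread newest-first and stops once the budget is exhausted,
    so long conversations retain their latest context.
    """
    kept, total = [], 0
    for msg in reversed(messages):
        t = count_tokens(msg["content"])
        if total + t > max_tokens:
            break
        kept.append(msg)
        total += t
    return list(reversed(kept))
```

Dropping the oldest turns is the simplest retention policy; at 500 million messages/day, many systems instead summarize evicted turns or pin the system prompt so it is never trimmed.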