Commit 109fb78

docs: update introduction (#10255)
1 parent be687df commit 109fb78

1 file changed

Lines changed: 29 additions & 85 deletions

docs/pages/product/introduction.mdx

@@ -1,8 +1,6 @@
 # Introduction

-Cube is the [agentic analytics](#agentic-analytics) platform built on top of the [semantic layer](#semantic-layer).
-
-## Agentic analytics
+Cube is the [agentic analytics](#agentic-analytics) platform built on top of the [open-source semantic layer](#semantic-layer).

 Cube enables AI agents and users to query, explore, and manipulate data models — transforming the semantic layer into a dynamic, governed workspace for generating insights, automating workflows, and building data products.

@@ -20,127 +18,73 @@ Cube is a new generation of a BI platform built to be used by both humans and AI

 With Cube, you can power copilots, automate data workflows, and create interactive analytics experiences—all grounded in a consistent and governed data model.

-<InfoBox>
-
-End users can access Cube's agentic analytics capabilities in [Workbooks][ref-workbooks].
-
-You can also bring agentic analytics to your own applications by using the [embedding][ref-embedding]
-capabilities, the [Chat API][ref-chat-api], and the [MCP server][ref-mcp-server].
-
-To familiarize yourself with the core concepts behind agentic analytics, take a look at our guides on
-[spaces, agents, models][ref-spaces-agents-models], [agent rules][ref-agent-rules], and [agent memories][ref-agent-memories].
-
-</InfoBox>
-
 ## Semantic layer

-Cube is a universal semantic layer that represents the next evolution of OLAP technology for the cloud data platform era. Born in the cloud, Cube bridges the gap left when traditional OLAP capabilities from legacy specialized servers were not fully translated to modern cloud data platforms.
-
-As data infrastructure evolved from traditional relational databases to cloud data warehouses, the need for multidimensional analysis, consistent metrics, and performance optimization remained. Cube addresses these challenges by making it easy to connect data silos, create consistent metrics, and make them accessible to any data experience your business or your customers needs.
+At the foundation of Cube's agentic analytics platform is an [open-source semantic layer](https://github.com/cube-js/cube)—the critical infrastructure that enables both AI agents and humans to work with trusted, consistent data.

-Data engineers and application developers use Cube's developer-friendly platform to organize data from your cloud data warehouses into centralized, consistent definitions, and deliver it to every downstream tool via its APIs.
+The semantic layer provides the governed data foundation that makes agentic analytics possible. It organizes data from your cloud data warehouses into centralized, consistent definitions that AI agents can reliably query, explore, and reason about. Without a semantic layer, AI agents would struggle with inconsistent metrics, scattered business logic, and ungoverned data access—making their outputs unreliable and potentially dangerous.

-Your business data becomes consistent, accurate, easy to access, and, most importantly, trusted.
-Once trusted, the use of data accelerates throughout your organization, delivering better experiences
-to your customers and driving intelligence back into the business.
+By establishing a single source of truth for metrics, relationships, and business logic, the semantic layer ensures that AI agents and users work with the same trusted definitions. This consistency is essential for agentic analytics: when an AI agent generates insights or automates workflows, it relies on the semantic layer's data model to understand what metrics mean, how entities relate, and what data users are authorized to access.

-<Diagram src="https://ucarecdn.com/8d945f29-e9eb-4e7f-9e9e-29ae7074e195/" />
+The semantic layer also provides the performance and governance infrastructure needed for agentic workflows. Through caching and pre-aggregations, it ensures AI agents can respond quickly without overwhelming your data warehouse. Through access controls, it guarantees that agents respect the same data security policies as human users.

-With Cube, you can build a data model, manage access control and caching, and expose your data to every application
-via REST, GraphQL, and SQL APIs. With these APIs, you can use any charting library to build custom UI, connect existing dashboarding and reporting tools, and build AI-powered data applications.
+Data engineers use Cube's semantic layer to build and maintain data models, manage access control and caching, and expose data through REST, GraphQL, and SQL APIs—creating the governed foundation that powers agentic analytics experiences, traditional BI tools, and custom data applications.

 ### Code-first

-Throughout the evolution of software engineering, numerous tools and methodologies have been developed to effectively handle codebases of all sizes.
-These include [version control systems](https://git-scm.com/) for seamless collaboration and code reviews,
-infrastructure for testing and documentation, as well as [established patterns](https://en.wikipedia.org/wiki/Design_Patterns) and
-best practices to structure codebases for reusability and maintainability.
+A code-first approach is essential for both traditional data engineering and agentic analytics. Managing data models, configurations, and policies as code enables the same proven practices that power modern software development: version control for collaboration and code reviews, automated testing and documentation, and established patterns for reusability and maintainability.

-At Cube, we firmly believe that the future of data engineering lies in the application of these proven practices and tools to data management.
-By doing so, we can facilitate collaboration at scale and create high-quality data products that are easily maintainable.
+For agentic analytics specifically, a code-first semantic layer creates new possibilities. AI agents can help curate and maintain data models themselves, accelerating development while maintaining quality through git workflows. The structured, version-controlled nature of code makes it easier for agents to understand changes, suggest improvements, and even implement modifications autonomously.

-The foundation of this approach lies in adopting a code-first workflow.
-That's why everything within Cube, from configurations to data models, is meticulously managed through code.
+Everything within Cube—from configurations to data models to access control policies—is managed through code. This foundation enables both human data engineers and AI agents to collaborate on building and maintaining the semantic layer that powers agentic analytics.

 ### Four pillars of semantic layer

-We believe that a complete, universal semantic layer should have the following four pillars: data model, caching, access controls, and APIs. These pillars address the core challenges that OLAP technology was originally designed to solve, but in a modern, cloud-native way.
+The semantic layer that powers Cube's agentic analytics platform is built on four essential pillars: data modeling, access control, caching, and APIs. Each pillar plays a critical role in enabling AI agents and users to work with data reliably, securely, and efficiently.

 #### Data Modeling

-**Data modeling framework is a foundational piece of the universal semantic layer.** It helps data teams to centralize data models upstream from
-data consumption tools, such as BIs, embedded analytics applications, or AI agents. It makes your data architecture DRY
-([Don't Repeat Yourself](https://en.wikipedia.org/wiki/Don%27t_repeat_yourself)) by reducing the repetition of data modeling across multiple presentation layers.
+**The data model provides the knowledge graph that AI agents use to understand your business.** It centralizes metric definitions, entity relationships, and business logic upstream from all consumption tools—whether those are AI agents, BI tools, or custom applications. This centralization is critical for agentic analytics: AI agents need a structured understanding of what metrics mean, how entities relate, and what calculations are valid.

-While modern cloud data platforms excel at processing large volumes of data, they lack native support for multidimensional analysis and modeling that traditional OLAP servers provided. Cube brings OLAP-style analytics to these platforms, enabling consistent metric definitions and multidimensional analysis.
+When an AI agent analyzes sales performance or answers questions about customer behavior, it relies on the semantic layer's data model to understand that "revenue" is calculated consistently, that customers have orders, and that orders contain line items. This structured knowledge enables agents to generate reliable insights and navigate complex data relationships autonomously.

-**Cube data model is code-first.** Data teams define data models with YAML or JavaScript code.
-The codebase is commonly managed with a version control system. Cube enables git flow for
-changes to data model and managing multiple isolated environments per project.
+**Cube's data model is code-first.** Data teams define data models with YAML or JavaScript code, managed through version control systems. This enables AI-assisted development where agents can help curate and maintain the semantic layer itself, accelerating model development while maintaining quality through git workflows and multiple isolated environments.

-**Cube data model is dataset-centric.** It is inspired by and expands upon dimensional modeling.
-Cube provides a practical framework for implementing dataset-centric data modeling.
+**Cube's data model is dataset-centric**, inspired by and expanding upon dimensional modeling. You work with two types of objects:

-When building a data model in Cube, you work with two dataset-centric objects: **cubes** and **views**.
-**Cubes** represent business entities such as customers, line items, and orders. In cubes,
-you define all the calculations within the measures and dimensions of these entities.
-Additionally, you define relationships between cubes, such as "an order has many line items" or "a user may place multiple orders."
+**Cubes** represent business entities such as customers, line items, and orders. They define all calculations within measures and dimensions, as well as relationships between entities. These relationships form the knowledge graph that AI agents traverse when exploring data and generating insights.

-**Views** sit on top of a data graph of cubes and create a facade of your entire data model,
-with which data consumers can interact. You can think of views as the final data products for your
-data consumers - BI users, data apps, AI agents, etc. When building views, you select measures and dimensions
-from different connected cubes and present them as a single dataset to BI or data apps.
+**Views** sit on top of the data graph of cubes, creating facades that data consumers interact with. Think of views as the final data products for AI agents, BI users, and applications. Views select measures and dimensions from connected cubes and present them as unified datasets, providing AI agents with the right context and scope for specific analytical tasks.

 #### Access Control

-**One of the benefits of semantic layer is the active security layer.**
-Semantic layer provides a comprehensive real-time understanding and governance of your data.
-When all your data consumption tools access data through the semantic layer, it becomes an ideal place to enforce access control policies.
+**Access control ensures that AI agents respect the same data security policies as human users.** This is critical for agentic analytics: when AI agents autonomously query and analyze data, they must enforce the same governance rules that apply to human users—whether that's row-level security, column-level restrictions, or data masking.
+
+By centralizing access control in the semantic layer, you ensure that all data consumption—whether by AI agents, BI tools, or custom applications—goes through a single governed checkpoint. This provides comprehensive oversight and prevents agents from inadvertently exposing sensitive data or violating security policies.

-Cube provides infrastructure to define different access control policies and patterns,
-including row-level and column-level security, data masking and more. Being a code-first,
-Cube enables data teams to **define access control policies with Python or JavaScript.**
-They can range from simple row-level access rules to completely custom data models per tenants backed by different data sources.
+Cube's code-first approach enables data teams to **define access control policies with Python or JavaScript**, ranging from simple row-level access rules to completely custom data models per tenant backed by different data sources. These policies apply uniformly to all consumers of the semantic layer, ensuring AI agents operate within the same security boundaries as human users.

 #### Caching

-The semantic layer can serve as a buffer to the data sources, protecting the cloud data warehouses from unnecessary and redundant load.
-Caching optimizes performance and can reduce the cloud data warehouse cost.
+**Caching enables AI agents to deliver fast, interactive experiences without overwhelming your data infrastructure.** For agentic analytics to be effective, AI agents must respond quickly to user questions, iteratively explore data, and generate insights in real-time. Without caching, every agent query would hit your data warehouse directly, creating latency issues and potentially significant costs.

-While cloud data warehouses have improved query performance through column-oriented storage and distributed processing, they still struggle with complex analytical workloads. This is where Cube's caching layer addresses the performance challenge that traditional OLAP servers were designed to solve.
+The semantic layer acts as a performance buffer between AI agents and your data sources. Through intelligent caching, it ensures agents can work interactively while protecting your cloud data warehouse from unnecessary and redundant load.

-Cube implements caching through the **aggregate awareness framework called pre-aggregations.**
-Data teams can define pre-aggregates in the data model as rollup tables, including measures and dimensions.
-Cube builds and refreshes these pre-aggregates in the background by executing queries in your cloud data warehouse
-and storing results in Cube Store, Cube's purpose-built caching engine backed by distributed file storage, such as S3.
-Pre-aggregations can be refreshed on schedule or as a part of the workflow orchestration DAG.
+Cube implements caching through an **aggregate awareness framework called pre-aggregations.** Data teams define pre-aggregates in the data model as rollup tables, including measures and dimensions. Cube builds and refreshes these pre-aggregates in the background by querying your cloud data warehouse and storing results in Cube Store, Cube's purpose-built caching engine backed by distributed file storage such as S3. Pre-aggregations can be refreshed on schedule or as part of workflow orchestration.

-When you send a query to Cube, it will use aggregate awareness to see if an existing and fresh pre-aggregate is
-available to serve that query. It can significantly speed up queries and reduce the load and cost of cloud data warehouses.
+When an AI agent sends a query to Cube, the aggregate awareness engine determines if an existing and fresh pre-aggregate can serve that query. This significantly accelerates agent responses and reduces both latency and data warehouse costs—essential for enabling the iterative, exploratory workflows that characterize agentic analytics.

 #### APIs

-One of the key requirements of the semantic layer is **interoperability with data consumption tools**: BIs, embedded analytics, and AI agents.
-The universal semantic layer cannot require one-off integration with every tool, framework, or library.
-It is not feasible to support the ever-growing number of data consumption tools in a one-to-one model.
-
-Legacy OLAP tools were limited in how they exposed data. Cube provides both modern APIs and support for traditional OLAP interfaces, making it a truly universal semantic layer.
-
-Rather than inventing its own communication language or protocol, **the semantic layer must adhere to existing protocols and
-API standards** to ensure universal interoperability.
+**APIs enable AI agents, applications, and tools to interact with the semantic layer through standard protocols.** For agentic analytics to work across diverse use cases—from AI-powered workbooks to embedded analytics to traditional BI—the semantic layer must provide universal interoperability. AI agents need to query data, introspect the data model, and integrate with other systems without requiring custom integrations for every tool or framework.

-Cube embraces and implements the three most commonly used protocols and API standards: **REST, GraphQL, and SQL.**
+Rather than inventing proprietary protocols, Cube implements widely adopted standards: **REST, GraphQL, and SQL.**

-**REST and GraphQL** are commonly used in software development as a communication layer between the backend server and the frontend visualization layer.
+**REST and GraphQL** provide modern API interfaces for building custom applications and enabling programmatic access. These APIs power agentic workflows, allowing AI agents to query data, retrieve results, and build interactive experiences.

-**SQL** is universally adopted across all the tools in the data stack. Every BI and visualization tool can query a SQL data source.
-That makes SQL an obvious choice for a communication layer to ensure interoperability. Cube implements Postgres SQL and extends
-it to support data modeling in the semantic layer. Cube adds the notion of **measure** to SQL spec, a special type that knows how to
-evaluate itself based on the definition in the data model. Every BI and visualization tool that can connect to Postgres or Redshift can connect to Cube.
+**SQL** is universally adopted across the data stack. Every BI tool, visualization platform, and data application can query a SQL data source. Cube implements Postgres-compatible SQL and extends it to support semantic layer concepts like measures—special types that know how to evaluate themselves based on data model definitions. Any tool that can connect to Postgres or Redshift can connect to Cube, making the semantic layer accessible to both AI agents and traditional analytics tools.

-Finally, Cube exposes **robust meta API for data model introspection.** It is vital to achieve interoperability because
-it enables other tools to inspect the data model definitions and take actions, e.g. provide context to the AI agents querying the semantic
-layer or create the necessary mappings in a BI tool to data model objects.
+**Data model introspection through the meta API** is essential for agentic analytics. It enables AI agents to discover available metrics, understand entity relationships, and determine valid queries—providing the context agents need to navigate the semantic layer autonomously. This same introspection capability allows BI tools to automatically map to data model objects and helps applications build dynamic interfaces.


 [ref-workbooks]: /product/workspace/workbooks
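
Note on the Caching text in this diff: the added paragraph says the aggregate awareness engine "determines if an existing and fresh pre-aggregate can serve that query". A toy sketch of that idea follows. It is not Cube's implementation. The `Rollup` class, the `pick_rollup` function, and member names such as `orders.count` are hypothetical, and the coverage rule (every queried measure and dimension must be present in the rollup) is a simplifying assumption that only holds for additive measures.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rollup:
    """Hypothetical pre-aggregated table: which members it stores, and whether it is fresh."""
    name: str
    measures: frozenset
    dimensions: frozenset
    fresh: bool = True


def pick_rollup(query_measures, query_dimensions, rollups):
    """Return the first fresh rollup that covers the query, or None.

    A rollup "covers" a query when it stores every requested measure and
    dimension. Returning None means the caller would fall back to the
    data warehouse.
    """
    needed_measures = set(query_measures)
    needed_dimensions = set(query_dimensions)
    for rollup in rollups:
        if not rollup.fresh:
            continue  # stale rollups are never used to answer a query
        if needed_measures <= rollup.measures and needed_dimensions <= rollup.dimensions:
            return rollup
    return None


# Illustrative data only; these names do not come from the commit.
rollups = [
    Rollup("orders_by_day", frozenset({"orders.count"}), frozenset({"orders.created_day"})),
    Rollup(
        "orders_by_status",
        frozenset({"orders.count", "orders.revenue"}),
        frozenset({"orders.status", "orders.created_day"}),
    ),
    Rollup("stale_rollup", frozenset({"orders.count"}), frozenset({"customers.city"}), fresh=False),
]
```

Calling `pick_rollup(["orders.revenue"], ["orders.status"], rollups)` selects `orders_by_status`, while a query grouped by `customers.city` returns `None` because the only rollup holding that dimension is stale.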
