Cloud-Native Lakehouse


Unifies the data lake and data warehouse on a cloud-native architecture, providing elastic scaling, ETL, governance, ML, and self-service BI for faster analytics, decision-making, and innovation.

Product Advantages
Cloud-Native Big Data

The platform offers storage-compute separation, elastic scaling, and flexible scheduling. Separating storage from compute keeps the system flexible, elastic scaling matches resources to demand automatically, and flexible scheduling improves coordination efficiency, giving enterprises a strong foundation for innovation.

Lakehouse Integration

Fully integrates the strengths of the data lake and the data warehouse. It retains the high performance and ease of management of a data warehouse while also supporting multiple data types, ACID transactions, and high concurrency on the lake. The data lake and data warehouse work in concert, giving enterprises more comprehensive and flexible data management capabilities.

Engine Innovation and Upgrades

The platform is designed around openness and integrates and deeply optimizes multiple compute engines. Spark, Flink, and other engines have been tuned for stronger computing performance, giving enterprises flexible, powerful, and diverse computing support to handle the growing diversity of data that comes with rapid business growth.

Core Capabilities
One Lakehouse, One Copy of Data Serving Multiple Business Scenarios

Built on a cloud-native architecture, the lakehouse engine combines the strengths of the data warehouse and the data lake and brings structured, semi-structured, and unstructured data into a single lake. It balances high-performance data management with flexible use, so that one copy of data can serve many business scenarios. Real-time queries and OLAP analysis are supported through a unified interface for efficient, agile data sharing. The engine also provides ACID guarantees for data uploads, modifications, and queries, which strengthens transactional processing and makes it suitable for a wide range of complex business needs.
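To make the ACID guarantee concrete, here is a minimal Python sketch of one common way lakehouse table formats provide atomic commits. A change becomes visible only when a new version is appended to a log, and a writer that started from a stale version is rejected. This is a toy model under stated assumptions, not the engine's actual implementation; `ToyTable`, `commit`, `snapshot`, and the file names are hypothetical.

```python
from dataclasses import dataclass, field


class CommitConflict(Exception):
    """Raised when another writer committed first."""


@dataclass
class ToyTable:
    # Each log entry is the full tuple of data files visible at that version.
    log: list = field(default_factory=lambda: [()])

    @property
    def version(self) -> int:
        return len(self.log) - 1

    def snapshot(self, version=None) -> tuple:
        """Return the files visible at a version (latest by default)."""
        return self.log[self.version if version is None else version]

    def commit(self, read_version: int, added=(), removed=()) -> int:
        """Publish a change in one step, or fail without side effects."""
        if read_version != self.version:
            raise CommitConflict(
                f"expected v{read_version}, table is at v{self.version}"
            )
        current = self.log[-1]
        new_files = tuple(f for f in current if f not in removed) + tuple(added)
        self.log.append(new_files)  # the single step that makes the change visible
        return self.version


table = ToyTable()
v1 = table.commit(0, added=["part-000.parquet"])
v2 = table.commit(v1, added=["part-001.parquet"], removed=["part-000.parquet"])

# A writer that still believes the table is at v1 must not overwrite v2.
try:
    table.commit(v1, added=["part-002.parquet"])
    conflict = False
except CommitConflict:
    conflict = True
```

Readers that ask for an older version still see the files that were visible at that point, which is how snapshot isolation falls out of the same log.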

Storage-Compute Separation, Saving Storage Resources Several Times Over

Object storage and Kubernetes are used to build a big data architecture that separates storage from compute. Decoupling the two saves storage resources several times over. Compute and storage nodes scale on demand, which increases system flexibility, meets large-scale data processing needs, and delivers higher performance and scalability. Besides helping enterprises optimize resources and reduce costs, the architecture also strengthens data security and isolation and improves resilience and competitiveness.
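As a hedged illustration of storage-compute separation, the Python sketch below assembles standard Apache Spark settings that run the driver and executors on Kubernetes while keeping data in S3-compatible object storage through the S3A connector. The configuration keys are standard Spark and Hadoop options. The API server URL, endpoint, image name, and namespace are placeholder assumptions, and the platform's own deployment settings may differ.

```python
def lakehouse_spark_conf(k8s_api, s3_endpoint, image, namespace="lakehouse"):
    """Build Spark settings for compute on Kubernetes and data in object storage."""
    return {
        # Compute: the driver and executors run as Kubernetes pods.
        "spark.master": f"k8s://{k8s_api}",
        "spark.kubernetes.namespace": namespace,
        "spark.kubernetes.container.image": image,
        # Storage: data lives in object storage, reached through the S3A connector.
        "spark.hadoop.fs.s3a.endpoint": s3_endpoint,
        "spark.hadoop.fs.s3a.path.style.access": "true",
    }


def to_submit_args(conf):
    """Render the settings as spark-submit --conf arguments."""
    args = []
    for key, value in sorted(conf.items()):
        args += ["--conf", f"{key}={value}"]
    return args


# All three values below are placeholders, not real endpoints.
conf = lakehouse_spark_conf(
    k8s_api="https://kubernetes.example.internal:6443",
    s3_endpoint="https://objects.example.internal",
    image="registry.example.internal/spark:latest",
)
submit_args = to_submit_args(conf)
```

Because no setting ties data to a particular compute node, executors can be added or removed without moving any stored data.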

Elastic Scaling, 100% Computing Resource Utilization

Flink Operator and Spark Operator are introduced to scale computing resources elastically and automatically. Resources are allocated dynamically to follow the tidal pattern of data workloads, which avoids preemption of offline tasks and keeps compute utilization as high as possible. Applications request resources on demand, so the business can respond flexibly to fluctuations in demand and get the most out of its computing resources.
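The idea of following tidal load can be sketched with a small Python function that sizes an executor pool from the current backlog, clamped between a floor and a ceiling. This illustrates the general scaling principle only. It is not the algorithm used by Flink Operator or Spark Operator, and `desired_executors`, its parameters, and the sample backlog are hypothetical.

```python
import math


def desired_executors(pending_tasks, tasks_per_executor, min_executors, max_executors):
    """Return how many executors to run for the current backlog."""
    if tasks_per_executor <= 0:
        raise ValueError("tasks_per_executor must be positive")
    needed = math.ceil(pending_tasks / tasks_per_executor)
    # Never drop below the floor, never exceed the ceiling.
    return max(min_executors, min(max_executors, needed))


# A made-up backlog that rises and falls like a tide over five intervals.
backlog = [0, 40, 400, 4000, 120]
plan = [desired_executors(n, tasks_per_executor=8, min_executors=2, max_executors=100)
        for n in backlog]
# plan == [2, 5, 50, 100, 15]
```

The floor keeps a small pool warm during quiet periods, and the ceiling stops a burst from starving other workloads on the cluster.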

Unified Metadata for End-to-End Data Governance

Unified metadata makes data search and discovery efficient: metadata is collected and displayed comprehensively, so users can quickly and accurately find the data resources they need. Data lineage tracing, data quality monitoring, and repair are supported to improve data credibility, traceability, consistency, and accuracy. Metadata classification and tagging tighten control over sensitive information and help mitigate compliance risks.
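Lineage tracing can be pictured as a graph walk over metadata. The Python sketch below records which datasets each dataset is derived from and returns every upstream source, which is what a user would consult to trace a suspect value back to its origin. The dataset names and the `upstream` helper are hypothetical, and a real metadata service would store far richer information.

```python
from collections import deque

# Hypothetical lineage metadata: each dataset maps to the datasets it reads from.
LINEAGE = {
    "report.daily_sales": ["dw.orders_clean", "dw.stores"],
    "dw.orders_clean": ["raw.orders"],
    "dw.stores": ["raw.stores"],
}


def upstream(dataset, lineage):
    """Return every dataset that the given dataset depends on, directly or not."""
    seen = set()
    queue = deque([dataset])
    while queue:
        current = queue.popleft()
        for parent in lineage.get(current, []):
            if parent not in seen:  # guard against revisiting shared or cyclic parents
                seen.add(parent)
                queue.append(parent)
    return seen


sources = upstream("report.daily_sales", LINEAGE)
```

Running the same walk in the opposite direction gives impact analysis, that is, which downstream reports are affected when a source table changes.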

Engine Enhancements Make the Compute Engine More Stable, Faster, and More Resilient

Apache Celeborn is used as the remote shuffle service (RSS) to address problems common in large-scale Flink and Spark jobs: full disks, unstable networks, and unstable random I/O. The compute engine becomes more stable, and large-volume shuffles see significant performance gains, which speeds up computing tasks and improves the quality of task execution.
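For orientation, the Python sketch below collects the client-side Spark settings that Celeborn's documentation describes for routing shuffle data to a Celeborn cluster. The keys follow Celeborn's published Spark client configuration as far as we are aware; check them against the Celeborn version in use. The host names are placeholders, and the platform's actual tuning is not stated in the source.

```python
def celeborn_spark_conf(master_endpoints):
    """Build Spark settings that send shuffle data to a Celeborn cluster."""
    if not master_endpoints:
        raise ValueError("at least one Celeborn master endpoint is required")
    return {
        # Replace Spark's built-in shuffle with the Celeborn shuffle client.
        "spark.shuffle.manager": "org.apache.spark.shuffle.celeborn.SparkShuffleManager",
        "spark.serializer": "org.apache.spark.serializer.KryoSerializer",
        # Where the shuffle client finds the Celeborn masters.
        "spark.celeborn.master.endpoints": ",".join(master_endpoints),
        # The external shuffle service on the nodes is not used in this setup.
        "spark.shuffle.service.enabled": "false",
    }


# Placeholder hosts; 9097 is the default Celeborn master port.
shuffle_conf = celeborn_spark_conf(["celeborn-master-0:9097", "celeborn-master-1:9097"])
```

Because shuffle data is written to the remote service rather than to executor-local disks, executors hold less state, which is what makes the disk-full and random I/O problems described above less likely.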

Application Scenarios
Cloud-Native Lakehouse Cluster Construction

Business Pain Points

Mainstream data warehouse architectures must support both real-time and offline computing, along with the corresponding storage. However, the real-time and offline computing layers, and likewise the real-time and offline storage layers, are not unified and remain split. This demands more hardware resources and more effort to maintain the code.


Business Value

Simplified management: Simplifies the management and maintenance of data lakes and data warehouses.

Focus on development: Teams no longer need to spend effort on data conversion and can focus on data development, improving business insight.

Cost reduction and efficiency: The storage-compute separation architecture uses object storage to make more efficient use of storage resources.
