Built on a cloud-native architecture, the lakehouse engine combines the strengths of data warehouses and data lakes. It brings structured, semi-structured, and unstructured data into a single lake and balances high-performance data management with flexible use, so that one copy of the data can serve many business scenarios. Real-time queries and OLAP analysis are supported through a unified interface, which keeps data sharing efficient and agile. The engine also provides ACID guarantees for data writes, updates, and queries, which strengthens transactional processing and makes it a good fit for complex business needs.
Object storage and Kubernetes are used to build a big data architecture that separates storage from compute, and decoupling the two saves storage resources. Compute nodes and storage nodes scale on demand, which increases system flexibility, meets large-scale data processing needs, and improves performance and scalability. Besides helping enterprises optimize resources and reduce costs, the architecture strengthens data security and isolation and improves enterprise resilience and competitiveness.
Flink Operator and Spark Operator are introduced to scale computing resources elastically and automatically. Resources are allocated dynamically according to the tidal pattern of data workloads (recurring peaks and troughs), which avoids resource contention with offline tasks and keeps compute utilization high. Applications request resources on demand, so the business can respond flexibly to fluctuations in demand and get the most out of its computing resources.
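The tidal allocation described above can be illustrated with a minimal sketch. It is not the operators' actual scaling logic; the `ScalingPolicy` type, the `desired_workers` function, and every number below are hypothetical and serve only to show how a worker count can follow workload peaks and troughs within fixed bounds.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class ScalingPolicy:
    """Hypothetical bounds for an elastically scaled job."""
    min_workers: int
    max_workers: int
    records_per_worker: int  # assumed backlog one worker can absorb per interval


def desired_workers(backlog_records: int, policy: ScalingPolicy) -> int:
    """Return the worker count needed for the backlog, clamped to the policy bounds."""
    needed = math.ceil(backlog_records / policy.records_per_worker)
    return max(policy.min_workers, min(policy.max_workers, needed))


# Illustrative backlog at a trough, a normal load, and a peak.
policy = ScalingPolicy(min_workers=2, max_workers=20, records_per_worker=50_000)
for label, backlog in [("trough", 40_000), ("normal", 600_000), ("peak", 2_000_000)]:
    print(label, desired_workers(backlog, policy))
```

The lower bound keeps the job responsive during troughs, and the upper bound stops one job from taking resources that other tasks need during peaks.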
Unified metadata management collects and displays metadata comprehensively, so users can search for and discover the data resources they need quickly and accurately. Data lineage tracing, data quality monitoring, and data repair are supported to improve the credibility, traceability, consistency, and accuracy of data. Metadata classification and tagging strengthen control over sensitive information and reduce potential compliance risks.
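Lineage tracing can be pictured as a walk over a dependency graph stored in the metadata. The sketch below is an illustration only, not the product's metadata API: the table names, the `UPSTREAM` mapping, and the `trace_upstream` function are all hypothetical.

```python
from collections import deque

# Hypothetical lineage edges: each table maps to the tables it is derived from.
UPSTREAM = {
    "ads.sales_report": ["dws.daily_sales"],
    "dws.daily_sales": ["dwd.orders", "dwd.payments"],
    "dwd.orders": ["ods.orders_raw"],
    "dwd.payments": ["ods.payments_raw"],
}


def trace_upstream(table: str, upstream: dict[str, list[str]]) -> list[str]:
    """Breadth-first walk that returns every ancestor of `table`, nearest first."""
    seen = {table}
    ancestors = []
    queue = deque([table])
    while queue:
        current = queue.popleft()
        for parent in upstream.get(current, []):
            if parent not in seen:  # guard against shared parents and cycles
                seen.add(parent)
                ancestors.append(parent)
                queue.append(parent)
    return ancestors


print(trace_upstream("ads.sales_report", UPSTREAM))
```

When a quality check fails on a report table, a walk like this narrows down which source tables to inspect.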
Apache Celeborn is used as the remote shuffle service (RSS) to address problems that are common in large-scale Flink and Spark jobs: full disks, unstable networks, and unstable random I/O. The computing engine becomes more stable, and shuffle performance on large data volumes improves significantly, which speeds up computing tasks and makes job runs more reliable.
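As a rough sketch of how a Spark job might be pointed at Celeborn, the helper below assembles the relevant Spark properties and renders them as `spark-submit` arguments. The property names follow the Celeborn Spark client documentation and should be verified against the Celeborn version in use. The helper functions and the endpoint host names are hypothetical.

```python
def celeborn_spark_conf(master_endpoints: list[str]) -> dict[str, str]:
    """Build Spark properties that route shuffle data through Celeborn."""
    return {
        # Replace Spark's built-in shuffle manager with the Celeborn client.
        "spark.shuffle.manager": "org.apache.spark.shuffle.celeborn.SparkShuffleManager",
        # Comma-separated host:port list of Celeborn masters.
        "spark.celeborn.master.endpoints": ",".join(master_endpoints),
        # The external shuffle service is not used when Celeborn serves shuffle data.
        "spark.shuffle.service.enabled": "false",
    }


def to_submit_args(conf: dict[str, str]) -> list[str]:
    """Render the properties as repeated `--conf key=value` arguments."""
    args = []
    for key, value in conf.items():
        args += ["--conf", f"{key}={value}"]
    return args


# Hypothetical master addresses; substitute the real cluster endpoints.
conf = celeborn_spark_conf(["celeborn-master-0:9097", "celeborn-master-1:9097"])
print(" ".join(to_submit_args(conf)))
```

Because shuffle data is written to the remote service rather than to executor-local disks, executors can be released without losing shuffle output.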
Business Pain Points
Mainstream data warehouse architectures must support both real-time and offline computing, along with the storage each mode requires. However, the real-time and offline computing layers are not unified, and neither are the real-time and offline storage layers, so the architecture is split in two. This split demands more hardware resources and more effort to maintain the code.
Business Value
Simplified management: Makes data lakes and data warehouses easier to manage and maintain.
Focus on development: Teams no longer need to deal with data conversion and can concentrate on data development, which improves business insight.
Cost reduction and efficiency: The storage-compute separation architecture uses object storage to make more efficient use of storage resources.







