Greenplum (GPDB) is open source! ~
Greenplum Database (GPDB) is a shared-nothing, massively parallel processing (MPP) database used mainly for large-scale data analysis workloads, including data warehousing, business intelligence (OLAP), and data mining. GPDB is designed specifically for analyzing massive data sets. It uses an advanced cost-based query optimizer and is positioned as one of the most advanced open source databases currently available, able to query and analyze petabyte-scale data quickly and efficiently.
Greenplum, the commercial database based on PostgreSQL, is now officially open source. Its source code is on GitHub: https://github.com/greenplum-db/gpdb. Database enthusiasts can now more conveniently study how it implements some advanced SQL query and analysis functions.
The Greenplum Database server is an advanced, full-featured open source data warehouse. It provides powerful and efficient analytics on petabyte-scale data. For big data analysis in particular, Greenplum Database includes a cost-based query optimizer, described by the project as the world's most advanced, to achieve high query and analysis performance.
The Greenplum open source project is released under the Apache 2 license. Greenplum thanks community contributors and other enthusiasts for their contributions to the product. For the Greenplum community, every form of contribution is meaningful, and Greenplum appreciates and encourages all of them.
"Open source massively parallel data warehouse"
Introduction to Greenplum Database
Greenplum is built on PostgreSQL and adds many important innovations for data warehouse workloads:
- Massively parallel processing architecture: Greenplum Database automatically provides parallel processing for all data and queries.
- Petabyte-scale loading: With MPP technology, performance stays high under heavy load, and each rack can load up to 10 TB of data per hour.
- Innovative query optimizer: Greenplum offers the industry's first cost-based query optimizer designed for big data workloads. It scales interactive and batch-mode analytics to petabyte-scale data sets without degrading query performance or data processing throughput.
- Polymorphic data storage and execution: The storage, execution, and compression settings of a table or partition can be configured flexibly to suit how the data is accessed. Users can choose row-oriented or column-oriented storage and processing according to their needs.
- Advanced machine learning: Through the Apache MADlib library, Greenplum Database extends its in-database analytics with user-defined functions.
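To make the shared-nothing, parallel processing idea from the list above more concrete, here is a minimal Python sketch. It is not Greenplum code. It only illustrates the general pattern: rows are hash-distributed across a set of segments, each segment aggregates only its own rows, and a coordinator merges the partial results. The names (NUM_SEGMENTS, segment_for, distribute, local_sum, gather), the sample rows, and the use of CRC32 as the hash are all assumptions made for illustration.

```python
import zlib
from collections import defaultdict

NUM_SEGMENTS = 4  # hypothetical number of segments, chosen for illustration


def segment_for(key, num_segments=NUM_SEGMENTS):
    """Map a distribution key to a segment index.

    CRC32 is a stand-in hash for this sketch, not the hash Greenplum uses.
    """
    return zlib.crc32(key.encode("utf-8")) % num_segments


def distribute(rows, num_segments=NUM_SEGMENTS):
    """Scatter rows so that each row lives on exactly one segment."""
    segments = [[] for _ in range(num_segments)]
    for row in rows:
        segments[segment_for(row["customer"], num_segments)].append(row)
    return segments


def local_sum(segment_rows):
    """Work done independently on one segment: sum amounts per customer."""
    partial = defaultdict(int)
    for row in segment_rows:
        partial[row["customer"]] += row["amount"]
    return dict(partial)


def gather(partials):
    """Coordinator step: merge the per-segment partial sums."""
    total = defaultdict(int)
    for partial in partials:
        for customer, amount in partial.items():
            total[customer] += amount
    return dict(total)


# Made-up sample data
rows = [
    {"customer": "alice", "amount": 10},
    {"customer": "bob", "amount": 5},
    {"customer": "alice", "amount": 30},
    {"customer": "carol", "amount": 7},
]

segments = distribute(rows)
partials = [local_sum(seg) for seg in segments]  # would run in parallel on real segments
result = gather(partials)
print(sorted(result.items()))
```

Because every row with the same key hashes to the same segment, each segment can compute its share without consulting the others. In this sketch the loop over segments runs sequentially, whereas a real MPP system would run that step on all segments at once.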
Related links:
1. Greenplum source code, documentation, and related information: http://greenplum.org/
2. Greenplum source code: https://github.com/greenplum-db
3. Website of Pivotal, which contributed Greenplum to open source: https://pivotal.io/big-data/pivotal-greenplum