The core software of a big data platform includes:

First, Phoenix

Phoenix is a Java middle tier that lets developers run SQL queries on Apache HBase. It is written entirely in Java, its code is hosted on GitHub, and it provides a JDBC driver that clients can embed.

The Phoenix query engine converts a SQL query into one or more HBase scans and orchestrates their execution to produce standard JDBC result sets. Because it works directly with the HBase API, coprocessors, and custom filters, its performance is on the order of milliseconds for simple queries and seconds for queries over millions of rows.
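The idea of turning a SQL predicate into a bounded scan can be illustrated with a toy model. The sketch below is not Phoenix code and does not use the HBase API; it assumes a small in-memory table sorted by row key, and the names `ROWS`, `range_scan`, and the sample data are hypothetical. A condition on the row key becomes the start and stop of the scan, and a condition on another column becomes a filter applied to each scanned row.

```python
from bisect import bisect_left

# Toy stand-in for an HBase table: rows kept sorted by row key (hypothetical data).
ROWS = [
    ("user001", {"age": 31, "city": "Oslo"}),
    ("user002", {"age": 24, "city": "Lima"}),
    ("user003", {"age": 45, "city": "Oslo"}),
    ("user004", {"age": 19, "city": "Pune"}),
]
KEYS = [key for key, _ in ROWS]


def range_scan(start_row, stop_row, column_filter=None):
    """Yield rows with start_row <= key < stop_row, optionally filtered."""
    lo = bisect_left(KEYS, start_row)  # first key >= start_row
    hi = bisect_left(KEYS, stop_row)   # first key >= stop_row (exclusive end)
    for key, columns in ROWS[lo:hi]:
        if column_filter is None or column_filter(columns):
            yield key, columns


# Roughly: SELECT * FROM users
#          WHERE id >= 'user002' AND id < 'user004' AND city = 'Oslo'
result = list(range_scan("user002", "user004", lambda c: c["city"] == "Oslo"))
print(result)
```

Only the rows between the two keys are visited, which is why key-based conditions are cheap compared with a full table scan.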

Second, Stinger

Stinger is the next-generation Hive initiative led by Hortonworks. It builds on Tez, a DAG computing framework that runs on YARN. In some tests, Stinger improves performance by about 10 times while also letting Hive support more SQL. Its main advantage is that users can run a wider range of queries on Hadoop, including analytic functions that use the OVER clause and subqueries in the WHERE clause, which brings Hive more in line with the SQL model.
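To show what an OVER-clause analytic function computes, here is a small pure-Python emulation of `RANK() OVER (PARTITION BY dept ORDER BY salary DESC)`. It is only a sketch of the semantics, not Hive or Stinger code, and the sample rows and the function name `rank_over_partition` are hypothetical. Unlike GROUP BY, every input row is kept and simply gains an extra computed column.

```python
from collections import defaultdict

# Hypothetical sample rows: (department, employee, salary)
rows = [
    ("eng", "ann", 120),
    ("eng", "bob", 100),
    ("ops", "cat", 90),
    ("eng", "dan", 80),
    ("ops", "eve", 70),
]


def rank_over_partition(rows):
    """Emulate RANK() OVER (PARTITION BY dept ORDER BY salary DESC)."""
    partitions = defaultdict(list)
    for dept, name, salary in rows:
        partitions[dept].append((name, salary))

    ranked = []
    for dept in sorted(partitions):
        # Order rows inside the partition by salary, highest first.
        members = sorted(partitions[dept], key=lambda m: m[1], reverse=True)
        rank = 0
        previous_salary = None
        for position, (name, salary) in enumerate(members, start=1):
            # Tied salaries share a rank; the next distinct salary skips ahead.
            if salary != previous_salary:
                rank = position
                previous_salary = salary
            ranked.append((dept, name, salary, rank))
    return ranked


for row in rank_over_partition(rows):
    print(row)
```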

Third, Presto

Presto is Facebook's open source data query engine, which can run fast, interactive analysis on more than 250 PB of data. Development of the project began in the autumn of 2012. The project is now used by more than 1,000 Facebook employees, who run more than 30,000 queries a day over about 1 PB of data. Facebook claims that Presto performs 10 times better than Hive and MapReduce.

Fourth, Shark

Shark is Hive on Spark. In essence, it uses Hive's HQL parser to translate HQL into RDD operations on Spark, and it obtains the table information in the database from Hive's metadata. Shark then reads the actual data and files from HDFS and processes them on Spark.
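The translation from a query into a chain of dataset operations can be pictured with plain Python built-ins. This is only an illustration of the idea and uses neither Shark nor Spark; the `records` data and the `reduce_by_key` helper are hypothetical, loosely modeled on the filter, map, and reduce-by-key style of RDD operations.

```python
def reduce_by_key(pairs, combine):
    """Merge values that share a key, in the spirit of a reduce-by-key step."""
    merged = {}
    for key, value in pairs:
        merged[key] = combine(merged[key], value) if key in merged else value
    return merged


# Hypothetical table contents.
records = [
    {"name": "ann", "age": 31, "city": "Oslo"},
    {"name": "bob", "age": 24, "city": "Lima"},
    {"name": "cat", "age": 45, "city": "Oslo"},
    {"name": "dan", "age": 19, "city": "Pune"},
]

# Roughly: SELECT city, COUNT(*) FROM users WHERE age > 20 GROUP BY city
filtered = filter(lambda r: r["age"] > 20, records)  # WHERE age > 20
pairs = map(lambda r: (r["city"], 1), filtered)      # emit (city, 1) per row
counts = reduce_by_key(pairs, lambda a, b: a + b)    # GROUP BY city, COUNT(*)
print(counts)
```

Each SQL clause maps onto one step of the chain, which is the kind of plan a query translator has to produce before the work can be distributed.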

Fifth, Pig

Pig is a programming language that simplifies common Hadoop tasks. Pig can load data, express transformations on it, and store the final result. Its built-in operations make sense of semi-structured data such as log files. Pig can also be extended with custom data types written in Java, and it supports data conversion.
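The load, transform, and store pattern described above can be sketched in a few lines of Python. This is not Pig Latin and does not run on Hadoop; the log format, the `RAW_LOG` sample, and the function names `load`, `transform`, and `store` are assumptions made for illustration. It shows how semi-structured log lines are given a schema, how malformed lines are skipped, and how an aggregated result is written out.

```python
import io
from collections import Counter

# Hypothetical access log; one line is deliberately malformed.
RAW_LOG = """\
2024-01-01 10:00:01 GET /index.html 200
2024-01-01 10:00:02 GET /missing 404
malformed line
2024-01-01 10:00:05 POST /login 200
2024-01-01 10:00:09 GET /missing 404
"""


def load(text):
    """Parse lines into records, skipping lines that do not fit the schema."""
    for line in text.splitlines():
        fields = line.split()
        if len(fields) == 5:
            _date, _time, method, path, status = fields
            yield {"method": method, "path": path, "status": int(status)}


def transform(records):
    """Count 404 responses per path."""
    return Counter(r["path"] for r in records if r["status"] == 404)


def store(counts, sink):
    """Write tab-separated results, one path per line, sorted by path."""
    for path, count in sorted(counts.items()):
        sink.write(f"{path}\t{count}\n")


sink = io.StringIO()
store(transform(load(RAW_LOG)), sink)
output = sink.getvalue()
print(output, end="")
```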