Facebook launches efficient query engine Presto
At a developer meeting at Facebook headquarters, engineers from the social networking giant revealed that they are using a new in-house query engine, Presto, to run interactive analysis on the company's massive 250 PB data warehouse.
According to engineer Martin Traverso, more than 850 Facebook engineers use it to scan more than 320 TB of data every day. In the past, he said, Facebook's scientists and analysts relied on Hive for data analysis. Hive, however, is designed for batch processing, and as the data kept growing it could no longer meet the company's needs. Facebook has other tools that are faster than Hive, but they are either limited in functionality or too simple to operate against its massive data warehouse. Over the past few months, Facebook has been using Presto to fill this gap.
Hive is a data warehouse tool that Facebook created for Hadoop a few years ago. Because it relies mainly on MapReduce, its speed has not kept up with growing data volumes. Scanning a complete data set could take anywhere from minutes to hours, which is simply impractical for interactive work.
Traverso also said that simple queries with Presto only take a few hundred milliseconds, and even very complex queries only take minutes to complete. It runs in memory and does not write to disk.
Presto may look like Facebook's version of Cloudera's Impala SQL query engine, or similar to what Hortonworks is doing with Project Stinger, but it is customized for faster operation at Facebook's scale. Presto won't compete with commercial products, but it will soon shake up the big data industry. Facebook plans to release Presto as open source this fall.
Ravi Murthy, an engineering manager at Facebook, said that as the number of users continues to grow, the data warehouse is growing rapidly as well. It is 4,000 times larger than it was four years ago, and Murthy expects the data to reach exabyte scale in the next few years. To accommodate data at that scale, he said, Facebook has had to rethink a lot of things.
Presto is one of them. In addition to improving query speed, the engine is seven times more CPU-efficient than Hive. Another ongoing project aims to shrink the space that analytics takes up in Facebook's data centers.
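The scale figures quoted above can be checked with a little arithmetic. The sketch below is only a back-of-the-envelope reading of the article's numbers, not something Facebook stated: it assumes that the 250 PB figure is the size at the end of the four-year period, that 1 PB equals 1,000 TB, and that growth was a constant yearly multiplier. The function names are hypothetical and exist only for this illustration.

```python
# Figures quoted in the article.
WAREHOUSE_PB = 250      # current warehouse size, in petabytes
GROWTH_FACTOR = 4000    # "4,000 times larger than four years ago"
YEARS = 4


def implied_baseline_tb(current_pb: float, factor: float) -> float:
    """Size at the start of the period in terabytes (assumes 1 PB = 1000 TB)."""
    return current_pb * 1000 / factor


def implied_annual_factor(factor: float, years: int) -> float:
    """Constant yearly multiplier that compounds to `factor` over `years`."""
    return factor ** (1 / years)


if __name__ == "__main__":
    baseline = implied_baseline_tb(WAREHOUSE_PB, GROWTH_FACTOR)
    yearly = implied_annual_factor(GROWTH_FACTOR, YEARS)
    print(f"Implied size four years ago: {baseline:.1f} TB")
    print(f"Implied yearly growth multiplier: {yearly:.2f}x")
```

Under these assumptions the warehouse would have held about 62.5 TB four years earlier and grown just under eightfold per year on average.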
What do the experts on Weibo think of Presto, the latest query engine launched by Facebook?
Big Data Pi Dong, former head of the Big Data Laboratory of EMC China Research Institute: Facebook's latest interactive big data query system, Presto, is similar to Cloudera's Impala and Hortonworks' Stinger, and meets the need for fast queries on Facebook's rapidly expanding massive data warehouse. Facebook is developing a new generation of big data systems for exabyte-scale data. Presto is one part of that, the interactive query system for the data warehouse, and there should also be a mass storage system. At this scale, there is a lot of design to consider!
Sina CTO and Co-President Jack Xu Liangjie: Social networks and social media have given birth to a real big data platform. Sina Weibo is no exception...
vinW, a computer and linguistics researcher at the University of Leeds, UK, and a postdoctoral researcher on the search project: 1. Presto will be open source in the autumn; 2. Seven times faster than Hive; 3. Memory-based.
Launch_Bruce: Facebook is not a search engine and has higher requirements for real-time performance. Even when Hive was first launched, it could only ever be a stopgap. That is in Hadoop's genes. In the end, Hadoop will certainly make life difficult for many projects that were launched blindly without in-depth thinking. Clearly, Hadoop's successful ecosystem will also harm many people.
TeslaElon: Come on! Big data will generate many business opportunities. In particular, potential cooperation with Alibaba, the largest e-commerce platform, and YOKU, the largest video platform, is worth looking forward to. In addition, Sina has invested in many popular applications on Weibo and has many opportunities. We will see later how well Sina does in R&D, management and sales.
Henry, who carries big data: We were doing big data analysis about five years ago, and our MPP product already had these strategies. At that time, the biggest big data problems were at Internet companies, but those star companies did not like to spend money on products and preferred to reinvent the wheel. Telecom customers are better: they are willing to buy rather than reinvent the wheel.
English from: gigaom.com
