Home Database Mysql Tutorial 用Twitter的cursor方式进行Web数据分页_MySQL

用Twitter的cursor方式进行Web数据分页_MySQL

Jun 01, 2016 pm 01:49 PM
count cursor twitter technology

bitsCN.com

  本文讨论Web应用中实现数据分页功能,不同的技术实现方式的性能方区别。

/

  上图功能的技术实现方法拿MySQL来举例就是

  select * from msgs where thread_id = ? limit page * count, count

  不过在看Twitter API的时候,我们却发现不少接口使用cursor的方法,而不用page, count这样直观的形式,如 followers ids 接口

 

  URL:

  http://twitter.com/followers/ids.format

  Returns an array of numeric IDs for every user following the specified user.

  Parameters:

  * cursor. Required. Breaks the results into pages. Provide a value of -1 to begin paging. Provide values as returned to in the response body’s next_cursor and previous_cursor attributes to page back and forth in the list.

  o Example: http://twitter.com/followers/ids/barackobama.xml?cursor=-1

  o Example: http://twitter.com/followers/ids/barackobama.xml?cursor=-1300794057949944903

 

  从上面描述可以看到,http://twitter.com/followers/ids.xml 这个调用需要传cursor参数来进行分页,而不是传统的 url?page=n&count=n的形式。这样做有什么优点呢?是否让每个cursor保持一个当时数据集的镜像?防止由于结果集实时改变而产生查询结果有重复内容?

  在Google Groups这篇Cursor Expiration讨论中Twitter的架构师John Kalucki提到

 

  A cursor is an opaque deletion-tolerant index into a Btree keyed by source

  userid and modification time. It brings you to a point in time in the

  reverse chron sorted list. So, since you can’t change the past, other than

  erasing it, it’s effectively stable. (Modifications bubble to the top.) But

  you have to deal with additions at the list head and also block shrinkage

  due to deletions, so your blocks begin to overlap quite a bit as the data

  ages. (If you cache cursors and read much later, you’ll see the first few

  rows of cursor[n+1]’s block as duplicates of the last rows of cursor[n]’s

  block. The intersection cardinality is equal to the number of deletions in

  cursor[n]’s block). Still, there may be value in caching these cursors and

  then heuristically rebalancing them when the overlap proportion crosses some

  threshold.

 

  在另外一篇new cursor-based pagination not multithread-friendly中John又提到

 

  The page based approach does not scale with large sets. We can no

  longer support this kind of API without throwing a painful number of

  503s.

  Working with row-counts forces the data store to recount rows in an O

  (n^2) manner. Cursors avoid this issue by allowing practically

  constant time access to the next block. The cost becomes O(n/

  block_size) which, yes, is O(n), but a graceful one given n

  a block_size of 5000. The cursor approach provides a more complete and

  consistent result set.

  Proportionally, very few users require multiple page fetches with a

  page size of 5,000.

  Also, scraping the social graph repeatedly at high speed is could

  often be considered a low-value, borderline abusive use of the social

  graph API.

 

  通过这两段文字我们已经很清楚了,对于大结果集的数据,使用cursor方式的目的主要是为了极大地提高性能。还是拿MySQL为例说明,比如翻页到100,000条时,不用cursor,对应的SQL为

  select * from msgs limit 100000, 100

  在一个百万记录的表上,第一次执行这条SQL需要5秒以上。

  假定我们使用表的主键的值作为cursor_id, 使用cursor分页方式对应的SQL可以优化为

  select * from msgs where id > cursor_id limit 100;

  同样的表中,通常只需要100ms以下, 效率会提高几十倍。MySQL limit性能差别也可参看我3年前写的一篇不成熟的文章 MySQL LIMIT 的性能问题。

  结论

  建议Web应用中大数据集翻页可以采用这种cursor方式,不过此方法缺点是翻页时必须连续,不能跳页。

bitsCN.com
Statement of this Website
The content of this article is voluntarily contributed by netizens, and the copyright belongs to the original author. This site does not assume corresponding legal responsibility. If you find any content suspected of plagiarism or infringement, please contact admin@php.cn

Hot AI Tools

Undresser.AI Undress

Undresser.AI Undress

AI-powered app for creating realistic nude photos

AI Clothes Remover

AI Clothes Remover

Online AI tool for removing clothes from photos.

Undress AI Tool

Undress AI Tool

Undress images for free

Clothoff.io

Clothoff.io

AI clothes remover

AI Hentai Generator

AI Hentai Generator

Generate AI Hentai for free.

Hot Article

R.E.P.O. Energy Crystals Explained and What They Do (Yellow Crystal)
4 weeks ago By 尊渡假赌尊渡假赌尊渡假赌
R.E.P.O. Best Graphic Settings
4 weeks ago By 尊渡假赌尊渡假赌尊渡假赌
R.E.P.O. How to Fix Audio if You Can't Hear Anyone
4 weeks ago By 尊渡假赌尊渡假赌尊渡假赌
WWE 2K25: How To Unlock Everything In MyRise
1 months ago By 尊渡假赌尊渡假赌尊渡假赌

Hot Tools

Notepad++7.3.1

Notepad++7.3.1

Easy-to-use and free code editor

SublimeText3 Chinese version

SublimeText3 Chinese version

Chinese version, very easy to use

Zend Studio 13.0.1

Zend Studio 13.0.1

Powerful PHP integrated development environment

Dreamweaver CS6

Dreamweaver CS6

Visual web development tools

SublimeText3 Mac version

SublimeText3 Mac version

God-level code editing software (SublimeText3)

The Stable Diffusion 3 paper is finally released, and the architectural details are revealed. Will it help to reproduce Sora? The Stable Diffusion 3 paper is finally released, and the architectural details are revealed. Will it help to reproduce Sora? Mar 06, 2024 pm 05:34 PM

StableDiffusion3’s paper is finally here! This model was released two weeks ago and uses the same DiT (DiffusionTransformer) architecture as Sora. It caused quite a stir once it was released. Compared with the previous version, the quality of the images generated by StableDiffusion3 has been significantly improved. It now supports multi-theme prompts, and the text writing effect has also been improved, and garbled characters no longer appear. StabilityAI pointed out that StableDiffusion3 is a series of models with parameter sizes ranging from 800M to 8B. This parameter range means that the model can be run directly on many portable devices, significantly reducing the use of AI

This article is enough for you to read about autonomous driving and trajectory prediction! This article is enough for you to read about autonomous driving and trajectory prediction! Feb 28, 2024 pm 07:20 PM

Trajectory prediction plays an important role in autonomous driving. Autonomous driving trajectory prediction refers to predicting the future driving trajectory of the vehicle by analyzing various data during the vehicle's driving process. As the core module of autonomous driving, the quality of trajectory prediction is crucial to downstream planning control. The trajectory prediction task has a rich technology stack and requires familiarity with autonomous driving dynamic/static perception, high-precision maps, lane lines, neural network architecture (CNN&GNN&Transformer) skills, etc. It is very difficult to get started! Many fans hope to get started with trajectory prediction as soon as possible and avoid pitfalls. Today I will take stock of some common problems and introductory learning methods for trajectory prediction! Introductory related knowledge 1. Are the preview papers in order? A: Look at the survey first, p

DualBEV: significantly surpassing BEVFormer and BEVDet4D, open the book! DualBEV: significantly surpassing BEVFormer and BEVDet4D, open the book! Mar 21, 2024 pm 05:21 PM

This paper explores the problem of accurately detecting objects from different viewing angles (such as perspective and bird's-eye view) in autonomous driving, especially how to effectively transform features from perspective (PV) to bird's-eye view (BEV) space. Transformation is implemented via the Visual Transformation (VT) module. Existing methods are broadly divided into two strategies: 2D to 3D and 3D to 2D conversion. 2D-to-3D methods improve dense 2D features by predicting depth probabilities, but the inherent uncertainty of depth predictions, especially in distant regions, may introduce inaccuracies. While 3D to 2D methods usually use 3D queries to sample 2D features and learn the attention weights of the correspondence between 3D and 2D features through a Transformer, which increases the computational and deployment time.

What are the blockchain data analysis tools? What are the blockchain data analysis tools? Feb 21, 2025 pm 10:24 PM

The rapid development of blockchain technology has brought about the need for reliable and efficient analytical tools. These tools are essential to extract valuable insights from blockchain transactions in order to better understand and capitalize on their potential. This article will explore some of the leading blockchain data analysis tools on the market, including their capabilities, advantages and limitations. By understanding these tools, users can gain the necessary insights to maximize the possibilities of blockchain technology.

Review! Deep model fusion (LLM/basic model/federated learning/fine-tuning, etc.) Review! Deep model fusion (LLM/basic model/federated learning/fine-tuning, etc.) Apr 18, 2024 pm 09:43 PM

In September 23, the paper "DeepModelFusion:ASurvey" was published by the National University of Defense Technology, JD.com and Beijing Institute of Technology. Deep model fusion/merging is an emerging technology that combines the parameters or predictions of multiple deep learning models into a single model. It combines the capabilities of different models to compensate for the biases and errors of individual models for better performance. Deep model fusion on large-scale deep learning models (such as LLM and basic models) faces some challenges, including high computational cost, high-dimensional parameter space, interference between different heterogeneous models, etc. This article divides existing deep model fusion methods into four categories: (1) "Pattern connection", which connects solutions in the weight space through a loss-reducing path to obtain a better initial model fusion

More than just 3D Gaussian! Latest overview of state-of-the-art 3D reconstruction techniques More than just 3D Gaussian! Latest overview of state-of-the-art 3D reconstruction techniques Jun 02, 2024 pm 06:57 PM

Written above & The author’s personal understanding is that image-based 3D reconstruction is a challenging task that involves inferring the 3D shape of an object or scene from a set of input images. Learning-based methods have attracted attention for their ability to directly estimate 3D shapes. This review paper focuses on state-of-the-art 3D reconstruction techniques, including generating novel, unseen views. An overview of recent developments in Gaussian splash methods is provided, including input types, model structures, output representations, and training strategies. Unresolved challenges and future directions are also discussed. Given the rapid progress in this field and the numerous opportunities to enhance 3D reconstruction methods, a thorough examination of the algorithm seems crucial. Therefore, this study provides a comprehensive overview of recent advances in Gaussian scattering. (Swipe your thumb up

Combination of Golang and front-end technology: explore how Golang plays a role in the front-end field Combination of Golang and front-end technology: explore how Golang plays a role in the front-end field Mar 19, 2024 pm 06:15 PM

Combination of Golang and front-end technology: To explore how Golang plays a role in the front-end field, specific code examples are needed. With the rapid development of the Internet and mobile applications, front-end technology has become increasingly important. In this field, Golang, as a powerful back-end programming language, can also play an important role. This article will explore how Golang is combined with front-end technology and demonstrate its potential in the front-end field through specific code examples. The role of Golang in the front-end field is as an efficient, concise and easy-to-learn

Revolutionary GPT-4o: Reshaping the human-computer interaction experience Revolutionary GPT-4o: Reshaping the human-computer interaction experience Jun 07, 2024 pm 09:02 PM

The GPT-4o model released by OpenAI is undoubtedly a huge breakthrough, especially in its ability to process multiple input media (text, audio, images) and generate corresponding output. This ability makes human-computer interaction more natural and intuitive, greatly improving the practicality and usability of AI. Several key highlights of GPT-4o include: high scalability, multimedia input and output, further improvements in natural language understanding capabilities, etc. 1. Cross-media input/output: GPT-4o+ can accept any combination of text, audio, and images as input and directly generate output from these media. This breaks the limitation of traditional AI models that only process a single input type, making human-computer interaction more flexible and diverse. This innovation helps power smart assistants

See all articles