超大数据分页之Twitter的cursor方式进行Web数据分页
文章介绍了关于Twitter的cursor方式进行Web数据分页用法,有需要了解大数据分页的朋友可参考一下。
上图功能的技术实现方法拿MySQL来举例就是
select * from msgs where thread_id = ? limit page * , count
不过在看Twitter API的时候,我们却发现不少使用cursor的方法,而不用page, count这样直观的形式,如 followers ids 接口
代码如下 | 复制代码 |
URL: http://twitter.com/followers/ids.format Returns an of numeric IDs for every user following the specified user. Parameters: |
http://twitter.com/followers/ids.format
从上面描述可以看到,http://twitter.com/followers/ids.xml 这个调用需要传cursor参数来进行分页,而不是传统的 url?page=n&count=n的形式。这样做有什么优点呢?是否让每个cursor保持一个当时数据集的镜像?防止由于结果集实时改变而产生查询结果有重复内容?
在 Groups这篇Cursor Expiration讨论中Twitter的架构师John Kalucki提到
代码如下 | 复制代码 |
A cursor is an opaque deletion-tolerant index into a Btree keyed by source |
在另外一篇new cursor-based pagination not multithread-friendly中John又提到
代码如下 | 复制代码 |
The page based approach does not scale with large sets. We can no Working with row-counts forces the data store to recount rows in an O Proportionally, very few users require multiple page fetches with a Also, scraping the social graph repeatedly at high speed is could |
通过这两段文字我们已经很清楚了,对于大结果集的数据,使用cursor方式的目的主要是为了极大地提高性能。还是拿MySQL为例说明,比如翻页到100,000条时,不用cursor,对应的SQL为
select * from msgs limit 100000, 100
在一个百万记录的表上,第一次执行这条SQL需要5秒以上。
假定我们使用表的主键的值作为cursor_id, 使用cursor分页方式对应的SQL可以优化为
select * from msgs where id > cursor_id limit 100;

Hot AI Tools

Undresser.AI Undress
AI-powered app for creating realistic nude photos

AI Clothes Remover
Online AI tool for removing clothes from photos.

Undress AI Tool
Undress images for free

Clothoff.io
AI clothes remover

AI Hentai Generator
Generate AI Hentai for free.

Hot Article

Hot Tools

Notepad++7.3.1
Easy-to-use and free code editor

SublimeText3 Chinese version
Chinese version, very easy to use

Zend Studio 13.0.1
Powerful PHP integrated development environment

Dreamweaver CS6
Visual web development tools

SublimeText3 Mac version
God-level code editing software (SublimeText3)

Hot Topics

This article addresses MySQL's "unable to open shared library" error. The issue stems from MySQL's inability to locate necessary shared libraries (.so/.dll files). Solutions involve verifying library installation via the system's package m

This article explores optimizing MySQL memory usage in Docker. It discusses monitoring techniques (Docker stats, Performance Schema, external tools) and configuration strategies. These include Docker memory limits, swapping, and cgroups, alongside

The article discusses using MySQL's ALTER TABLE statement to modify tables, including adding/dropping columns, renaming tables/columns, and changing column data types.

This article compares installing MySQL on Linux directly versus using Podman containers, with/without phpMyAdmin. It details installation steps for each method, emphasizing Podman's advantages in isolation, portability, and reproducibility, but also

This article provides a comprehensive overview of SQLite, a self-contained, serverless relational database. It details SQLite's advantages (simplicity, portability, ease of use) and disadvantages (concurrency limitations, scalability challenges). C

Article discusses configuring SSL/TLS encryption for MySQL, including certificate generation and verification. Main issue is using self-signed certificates' security implications.[Character count: 159]

This guide demonstrates installing and managing multiple MySQL versions on macOS using Homebrew. It emphasizes using Homebrew to isolate installations, preventing conflicts. The article details installation, starting/stopping services, and best pra

Article discusses popular MySQL GUI tools like MySQL Workbench and phpMyAdmin, comparing their features and suitability for beginners and advanced users.[159 characters]
