Home Database Mysql Tutorial [置顶] 如何在Mongodb集合中统计去重之后的数据

[置顶] 如何在Mongodb集合中统计去重之后的数据

Jun 07, 2016 pm 02:50 PM
mongodb Remove duplicates data statistics pin to top gather

比方说我们有个Mongodb集合, 以这个简单的集合为例,我们需要集合中包含多少不同的手机号码,首先想到的应该就是使用distinct关键字, db.tokencaller.distinct('Caller').length 如果想查看具体的而不同的手机号码,那么可以省略后面的length属性,因为 db

比方说我们有个Mongodb集合,

以这个简单的集合为例,我们需要集合中包含多少不同的手机号码,首先想到的应该就是使用distinct关键字,
db.tokencaller.distinct('Caller').length
如果想查看具体的而不同的手机号码,那么可以省略后面的length属性,因为db.tokencaller.distinct('Caller')返回的是由所有去重手机号码组成的数组。


但是,这种方式对于所有情况都是满足的嘛?并不如此,如果要统计的集合记录数较大,如千万级别的,那么在这么统计的时候往往会报10044错误,提示信息“exception : distinct too big , 16mb cap”. 后面我们将通过其他方式进行解决。
另外一种方式可以使用runCommand结合distinct进行使用,
db.runCommand({"distinct":"tokencaller","key":"Caller"})


可见在values上显示了去重之后的手机号码,,看结果是一个Json格式的,于是尝试了下看看能不能取出values的大小,因为如果对于大数据量的集合来说,直接显示去重的号码明显不合适,于是尝试了下面的写法:


发现是可以的,于是对大数据量使用了这种方式看看是否能取出结果,发现不存在length属性,想了想应该跟mongodb的客户端版本有关系吧,还待验证!!!
两种方式都不行,于是试了下mapReduce方式,具体如下:


然后我们会发现,他会将查询出来的结果输出到一个称为“callerstatis”的结合,如下所示:


然后使用db.callerstatis.count()就可以知道有多少不同的手机号码了。
使用这种方式,我们同样在大数据量的集合上试了一下,可惜还是失败了!!!!(桑心T_T),如果有谁有好的方法,麻烦也告诉我一下,小的感激不尽啊^_^
Statement of this Website
The content of this article is voluntarily contributed by netizens, and the copyright belongs to the original author. This site does not assume corresponding legal responsibility. If you find any content suspected of plagiarism or infringement, please contact admin@php.cn

Hot AI Tools

Undresser.AI Undress

Undresser.AI Undress

AI-powered app for creating realistic nude photos

AI Clothes Remover

AI Clothes Remover

Online AI tool for removing clothes from photos.

Undress AI Tool

Undress AI Tool

Undress images for free

Clothoff.io

Clothoff.io

AI clothes remover

AI Hentai Generator

AI Hentai Generator

Generate AI Hentai for free.

Hot Article

R.E.P.O. Energy Crystals Explained and What They Do (Yellow Crystal)
3 weeks ago By 尊渡假赌尊渡假赌尊渡假赌
R.E.P.O. Best Graphic Settings
3 weeks ago By 尊渡假赌尊渡假赌尊渡假赌
R.E.P.O. How to Fix Audio if You Can't Hear Anyone
3 weeks ago By 尊渡假赌尊渡假赌尊渡假赌
WWE 2K25: How To Unlock Everything In MyRise
4 weeks ago By 尊渡假赌尊渡假赌尊渡假赌

Hot Tools

Notepad++7.3.1

Notepad++7.3.1

Easy-to-use and free code editor

SublimeText3 Chinese version

SublimeText3 Chinese version

Chinese version, very easy to use

Zend Studio 13.0.1

Zend Studio 13.0.1

Powerful PHP integrated development environment

Dreamweaver CS6

Dreamweaver CS6

Visual web development tools

SublimeText3 Mac version

SublimeText3 Mac version

God-level code editing software (SublimeText3)

The U.S. Air Force showcases its first AI fighter jet with high profile! The minister personally conducted the test drive without interfering during the whole process, and 100,000 lines of code were tested for 21 times. The U.S. Air Force showcases its first AI fighter jet with high profile! The minister personally conducted the test drive without interfering during the whole process, and 100,000 lines of code were tested for 21 times. May 07, 2024 pm 05:00 PM

Recently, the military circle has been overwhelmed by the news: US military fighter jets can now complete fully automatic air combat using AI. Yes, just recently, the US military’s AI fighter jet was made public for the first time and the mystery was unveiled. The full name of this fighter is the Variable Stability Simulator Test Aircraft (VISTA). It was personally flown by the Secretary of the US Air Force to simulate a one-on-one air battle. On May 2, U.S. Air Force Secretary Frank Kendall took off in an X-62AVISTA at Edwards Air Force Base. Note that during the one-hour flight, all flight actions were completed autonomously by AI! Kendall said - "For the past few decades, we have been thinking about the unlimited potential of autonomous air-to-air combat, but it has always seemed out of reach." However now,

Tesla robots work in factories, Musk: The degree of freedom of hands will reach 22 this year! Tesla robots work in factories, Musk: The degree of freedom of hands will reach 22 this year! May 06, 2024 pm 04:13 PM

The latest video of Tesla's robot Optimus is released, and it can already work in the factory. At normal speed, it sorts batteries (Tesla's 4680 batteries) like this: The official also released what it looks like at 20x speed - on a small "workstation", picking and picking and picking: This time it is released One of the highlights of the video is that Optimus completes this work in the factory, completely autonomously, without human intervention throughout the process. And from the perspective of Optimus, it can also pick up and place the crooked battery, focusing on automatic error correction: Regarding Optimus's hand, NVIDIA scientist Jim Fan gave a high evaluation: Optimus's hand is the world's five-fingered robot. One of the most dexterous. Its hands are not only tactile

AI startups collectively switched jobs to OpenAI, and the security team regrouped after Ilya left! AI startups collectively switched jobs to OpenAI, and the security team regrouped after Ilya left! Jun 08, 2024 pm 01:00 PM

Last week, amid the internal wave of resignations and external criticism, OpenAI was plagued by internal and external troubles: - The infringement of the widow sister sparked global heated discussions - Employees signing "overlord clauses" were exposed one after another - Netizens listed Ultraman's "seven deadly sins" Rumors refuting: According to leaked information and documents obtained by Vox, OpenAI’s senior leadership, including Altman, was well aware of these equity recovery provisions and signed off on them. In addition, there is a serious and urgent issue facing OpenAI - AI safety. The recent departures of five security-related employees, including two of its most prominent employees, and the dissolution of the "Super Alignment" team have once again put OpenAI's security issues in the spotlight. Fortune magazine reported that OpenA

70B model generates 1,000 tokens in seconds, code rewriting surpasses GPT-4o, from the Cursor team, a code artifact invested by OpenAI 70B model generates 1,000 tokens in seconds, code rewriting surpasses GPT-4o, from the Cursor team, a code artifact invested by OpenAI Jun 13, 2024 pm 03:47 PM

70B model, 1000 tokens can be generated in seconds, which translates into nearly 4000 characters! The researchers fine-tuned Llama3 and introduced an acceleration algorithm. Compared with the native version, the speed is 13 times faster! Not only is it fast, its performance on code rewriting tasks even surpasses GPT-4o. This achievement comes from anysphere, the team behind the popular AI programming artifact Cursor, and OpenAI also participated in the investment. You must know that on Groq, a well-known fast inference acceleration framework, the inference speed of 70BLlama3 is only more than 300 tokens per second. With the speed of Cursor, it can be said that it achieves near-instant complete code file editing. Some people call it a good guy, if you put Curs

58 lines of code scale Llama 3 to 1 million contexts, any fine-tuned version is applicable 58 lines of code scale Llama 3 to 1 million contexts, any fine-tuned version is applicable May 06, 2024 pm 06:10 PM

Llama3, the majestic king of open source, the original context window is only... 8k, which makes me swallow back the words "it smells so good". Today, when 32k is the starting point and 100k is common, is this intentional to leave room for contributions to the open source community? The open source community certainly didn't miss this opportunity: now with just 58 lines of code, any fine-tuned version of Llama370b can automatically scale to 1048k (one million) contexts. Behind the scenes is a LoRA, extracted from a fine-tuned version of Llama370BInstruct that extends good context, and the file is only 800mb. Next, using Mergekit, you can run it with other models of the same architecture or merge it directly into the model. 1048k context used

China Mobile: Humanity is entering the fourth industrial revolution and officially announced 'three plans” China Mobile: Humanity is entering the fourth industrial revolution and officially announced 'three plans” Jun 27, 2024 am 10:29 AM

According to news on June 26, at the opening ceremony of the 2024 World Mobile Communications Conference Shanghai (MWC Shanghai), China Mobile Chairman Yang Jie delivered a speech. He said that currently, human society is entering the fourth industrial revolution, which is dominated by information and deeply integrated with information and energy, that is, the "digital intelligence revolution", and the formation of new productive forces is accelerating. Yang Jie believes that from the "mechanization revolution" driven by steam engines, to the "electrification revolution" driven by electricity, internal combustion engines, etc., to the "information revolution" driven by computers and the Internet, each round of industrial revolution is based on "information and "Energy" is the main line, bringing productivity development

An American professor used his 2-year-old daughter to train an AI model to appear in Science! Human cubs use head-mounted cameras to train new AI An American professor used his 2-year-old daughter to train an AI model to appear in Science! Human cubs use head-mounted cameras to train new AI Jun 03, 2024 am 10:08 AM

Unbelievably, in order to train an AI model, a professor from the State University of New York strapped a GoPro-like camera to his daughter’s head! Although it sounds incredible, this professor's behavior is actually well-founded. To train the complex neural network behind LLM, massive data is required. Is our current LLM training process necessarily the simplest and most efficient way? Certainly not! Scientists have discovered that in human toddlers, the brain absorbs water like a sponge, quickly forming a coherent worldview. Although LLM performs amazingly at times, over time, human children become smarter and more creative than the model! The secret of children mastering language. How to train LLM in a better way? When scientists are puzzled by the solution,

What is the use of net4.0 What is the use of net4.0 May 10, 2024 am 01:09 AM

.NET 4.0 is used to create a variety of applications and it provides application developers with rich features including: object-oriented programming, flexibility, powerful architecture, cloud computing integration, performance optimization, extensive libraries, security, Scalability, data access, and mobile development support.

See all articles