
Using JOINs in MongoDB NoSQL Databases

William Shakespeare
Release: 2025-02-18 12:59:14

Core points

  • MongoDB, a NoSQL database, introduced the $lookup operator in version 3.2. It performs a LEFT OUTER JOIN-like operation across two or more collections, which lets you manage data in a way similar to a relational database. However, the operator can only be used in aggregation operations, which are more complex and usually slower than simple find queries.
  • $lookup takes four parameters: localField (the field in the input document to match on), from (the collection to join), foreignField (the field to match in the from collection) and as (the name of the output field). It can be used in an aggregation query that matches posts, sorts them, limits the number of items, joins user data, flattens the user array and returns only the necessary fields.
  • Although $lookup is useful and can help manage a small amount of relational data in a NoSQL database, it is not a replacement for the more powerful JOIN clause in SQL. MongoDB also lacks constraints: if a user document is deleted, its post documents remain as orphans. If you find yourself using $lookup frequently, you may have chosen the wrong data store, and a relational (SQL) database may be more suitable.

Thanks to Julian Motz for his peer review help.


One of the biggest differences between SQL and NoSQL databases is JOIN. In a relational database, the SQL JOIN clause allows you to combine rows from two or more tables using a common field between them. For example, if you have a book and publisher table, you can write the following SQL command:

SELECT book.title, publisher.name
FROM book
LEFT JOIN publisher ON book.publisher_id = publisher.id;

In other words, the book table has a publisher_id field that references the id field in the publisher table.

This is practical because a single publisher can offer thousands of books. If we need to update a publisher's details in the future, we only change a single record. Data redundancy is minimized because we do not need to repeat the publisher's information for each book. This technique is called normalization.

SQL databases provide a range of normalization and constraint features to ensure relationships are maintained.

NoSQL == No JOIN?

This is not always the case…

Document-oriented databases (such as MongoDB) are designed to store denormalized data. Ideally, there should be no relationships between collections. If the same data is needed in two or more documents, it must be repeated.

This can be frustrating because there are few situations where you never need relational data. Fortunately, MongoDB 3.2 introduced a new $lookup operator that can perform LEFT OUTER JOIN-like operations on two or more collections. But there's a catch…

MongoDB Aggregation

$lookup is only permitted in aggregation operations. Think of an aggregation as a pipeline of operators that query, filter and group a result. The output of one operator is used as the input for the next.

Aggregations are harder to understand than simple find queries and usually run slower. However, they are powerful and a valuable option for complex search operations.
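
As a rough mental model only (plain JavaScript, not how MongoDB is implemented), a pipeline can be pictured as a list of functions applied in order, each one receiving the previous one's output. The runPipeline helper and the sample data are hypothetical:

```javascript
// Conceptual model of an aggregation pipeline: each stage is a function
// that takes an array of documents and returns a new array.
// `runPipeline` is a hypothetical helper, not a MongoDB API.
function runPipeline(docs, stages) {
  return stages.reduce((current, stage) => stage(current), docs);
}

const docs = [
  { n: 3, keep: true },
  { n: 1, keep: false },
  { n: 2, keep: true },
];

const result = runPipeline(docs, [
  (input) => input.filter((d) => d.keep),          // query/filter step
  (input) => [...input].sort((a, b) => a.n - b.n), // ordering step
  (input) => input.map((d) => d.n),                // reshape step
]);

console.log(result); // [ 2, 3 ]
```

Each step sees only what the step before it produced, which is why the order of the stages matters.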

It is best to use an example to explain the aggregation. Suppose we are creating a social media platform with a collection of users. It stores the details of each user in a separate document. For example:

{
  "_id": ObjectId("45b83bda421238c76f5c1969"),
  "name": "User One",
  "email": "userone@email.com",
  "country": "UK",
  "dob": ISODate("1999-09-13T00:00:00.000Z")
}

We can add as many fields as necessary, but every MongoDB document requires an _id field with a unique value. The _id is similar to a SQL primary key and is inserted automatically if you do not supply one.

Our social network now needs a collection of posts that store a large number of insightful updates from users. The document stores text, date, rating and references to the user who wrote it in the user_id field:

{
  "_id": ObjectId("17c9812acff9ac0bba018cc1"),
  "user_id": ObjectId("45b83bda421238c76f5c1969"),
  "date": ISODate("2016-09-05T03:05:00.123Z"),
  "text": "My life story so far",
  "rating": "important"
}

We now want to display the last twenty posts rated "important" by all users in reverse order of time. Each returned document should contain text, the time of the post, and the name and country of the associated user.

A MongoDB aggregation query is passed an array of pipeline operators that define each operation in order. First, we use the $match filter to extract all documents with the correct rating from the post collection:

{ "$match": { "rating": "important" } }

We now have to sort the matching items in reverse order by using the $sort operator:

{ "$sort": { "date": -1 } }

Since we only need twenty posts, we can apply the $limit stage so that MongoDB only needs to process the data we want:

{ "$limit": 20 }
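
To make the effect of the $match, $sort and $limit stages concrete, here is a plain-JavaScript stand-in that applies the same three steps to an in-memory array. The latestImportant function and the sample posts are hypothetical, and this is only an analogy for what the stages do, not how MongoDB executes them:

```javascript
// Plain-JavaScript stand-in for the first three stages, run on an
// in-memory array instead of a MongoDB collection (hypothetical data).
const posts = [
  { text: "a", rating: "important", date: new Date("2016-09-01T00:00:00Z") },
  { text: "b", rating: "minor", date: new Date("2016-09-02T00:00:00Z") },
  { text: "c", rating: "important", date: new Date("2016-09-03T00:00:00Z") },
];

function latestImportant(allPosts, limit) {
  return allPosts
    .filter((p) => p.rating === "important") // like $match
    .sort((a, b) => b.date - a.date)         // like $sort with date: -1
    .slice(0, limit);                        // like $limit
}

const latest = latestImportant(posts, 20);
console.log(latest.map((p) => p.text)); // [ 'c', 'a' ]
```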

We can now use the new $lookup operator to join data from the user collection. It requires an object with four parameters:

  • localField: the field in the input document to match on
  • from: the collection to join
  • foreignField: the field to match in the from collection
  • as: the name of the output field.

Therefore, our operator is:

{ "$lookup": {
  "localField": "user_id",
  "from": "user",
  "foreignField": "_id",
  "as": "userinfo"
} }

This will create a new field in our output called userinfo. It contains an array in which each item is a matching user document:

"userinfo": [
  { "name": "User One", ... }
]
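
The left-outer-join behaviour of $lookup can be approximated in plain JavaScript as follows. This is a sketch under stated assumptions: the lookup helper is hypothetical, from is an in-memory array rather than a collection name, and ids are plain strings where MongoDB would typically compare ObjectId values:

```javascript
// In-memory approximation of $lookup's left-outer-join behaviour.
// `lookup` is a hypothetical helper, not a MongoDB API.
function lookup(docs, { from, localField, foreignField, as }) {
  return docs.map((doc) => ({
    ...doc,
    [as]: from.filter((other) => other[foreignField] === doc[localField]),
  }));
}

const users = [{ _id: "u1", name: "User One", country: "UK" }];
const posts = [
  { _id: "p1", user_id: "u1", text: "My life story so far" },
  { _id: "p2", user_id: "missing", text: "Orphan post" },
];

const joined = lookup(posts, {
  from: users,
  localField: "user_id",
  foreignField: "_id",
  as: "userinfo",
});

console.log(joined[0].userinfo.length); // 1
console.log(joined[1].userinfo.length); // 0 (no match: empty array)
```

Every input document is kept, whether or not a match exists, which is what makes the operation "left outer" rather than "inner".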

Each post.user_id matches exactly one user._id because a post can only have one author. Therefore, our userinfo array will only ever contain one item. We can use the $unwind operator to deconstruct it into a subdocument:

{ "$unwind": "$userinfo" }

The output is now in a more practical format, to which further operators can be applied:

"userinfo": {
  "name": "User One", ...
}
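
A plain-JavaScript approximation of this flattening step is sketched below. The unwind helper is hypothetical and assumes the field is an array or missing. Like $unwind in its default configuration, it emits one document per array element and drops documents whose array is empty:

```javascript
// In-memory approximation of { "$unwind": "$userinfo" }.
// `unwind` is a hypothetical helper, not a MongoDB API.
function unwind(docs, field) {
  return docs.flatMap((doc) =>
    (doc[field] || []).map((item) => ({ ...doc, [field]: item }))
  );
}

const joined = [
  { text: "My life story so far", userinfo: [{ name: "User One" }] },
  { text: "Orphan post", userinfo: [] },
];

const flat = unwind(joined, "userinfo");
console.log(flat.length);           // 1
console.log(flat[0].userinfo.name); // User One
```

Note the side effect: posts with no matching user disappear at this step, so the combination of $lookup and a default $unwind behaves more like an inner join.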

Finally, we can use the $project stage in the pipeline to return text, time of post, user's name and country:

{ "$project": {
  "text": 1,
  "date": 1,
  "userinfo.name": 1,
  "userinfo.country": 1
} }

Put everything together

Our final aggregate query matches posts, sorts in order, limits to the latest twenty items, connects user data, flattens user arrays and returns only necessary fields. Complete command:

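db.post.aggregate([
  { "$match": { "rating": "important" } },
  { "$sort": { "date": -1 } },
  { "$limit": 20 },
  { "$lookup": { "localField": "user_id", "from": "user", "foreignField": "_id", "as": "userinfo" } },
  { "$unwind": "$userinfo" },
  { "$project": { "text": 1, "date": 1, "userinfo.name": 1, "userinfo.country": 1 } }
]);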
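
If the query is issued from application code rather than the shell, one option is to build the pipeline array in a small function so that the rating and limit can vary. This is a minimal sketch: buildPostPipeline and its parameters are hypothetical names, while the stages are the ones described above:

```javascript
// Builds the six-stage pipeline described above. `buildPostPipeline`
// and its parameters are hypothetical names introduced for this sketch.
function buildPostPipeline(rating, limit) {
  return [
    { $match: { rating } },
    { $sort: { date: -1 } },
    { $limit: limit },
    { $lookup: { from: "user", localField: "user_id", foreignField: "_id", as: "userinfo" } },
    { $unwind: "$userinfo" },
    { $project: { text: 1, date: 1, "userinfo.name": 1, "userinfo.country": 1 } },
  ];
}

const pipeline = buildPostPipeline("important", 20);
console.log(pipeline.map((stage) => Object.keys(stage)[0]).join(" > "));
// $match > $sort > $limit > $lookup > $unwind > $project
```

Assuming a connected db handle from the official Node.js driver, the array could then be passed to db.collection("post").aggregate(pipeline).toArray().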

The result is a collection of up to twenty documents. For example:

{
  "_id": ObjectId("17c9812acff9ac0bba018cc1"),
  "text": "My life story so far",
  "date": ISODate("2016-09-05T03:05:00.123Z"),
  "userinfo": { "name": "User One", "country": "UK" }
}

Great! I can finally switch to NoSQL!

MongoDB's $lookup is useful and powerful, but even this basic example requires a complex aggregation query. It cannot replace the more powerful JOIN clause in SQL. MongoDB also does not provide constraints; if a user document is deleted, its post documents remain as orphans.

Ideally, the $lookup operator should be rarely needed. If you need it frequently, you may have used the wrong data store...

If you have relational data, please use a relational (SQL) database!

That said, $lookup is a welcome addition to MongoDB 3.2. It overcomes some of the more frustrating problems of using a small amount of relational data in a NoSQL database.

FAQs about using JOINs in MongoDB NoSQL databases

What is the difference between a SQL join and a MongoDB join?

In a SQL database, a join combines rows from two or more tables based on related columns. MongoDB, as a NoSQL database, does not support traditional SQL joins. Instead, it provides two aggregation stages that perform similar operations: $lookup and $graphLookup. These let you combine data from multiple collections into a single result set.

How does the $lookup stage in MongoDB work?

The $lookup stage lets you join documents from another collection (the "joined" collection) and add them to the input documents. It specifies the "from" collection, the "localField" and "foreignField" used to match documents, and the "as" field that receives the output. It is similar to a left outer join in SQL: it returns all documents from the input collection along with any matching documents from the "from" collection.

Can I perform recursive search using MongoDB joins?

Yes, MongoDB provides the $graphLookup stage for recursive search. It performs a recursive search on the specified collection and can optionally limit the depth of the search and restrict which documents it considers. It is useful for querying hierarchical data or graphs where the number of levels is unknown or may change.
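
As an illustration, a $graphLookup stage might look like the sketch below. The employee collection and its manager_id field are hypothetical and are not part of the article's example. The parameter names (from, startWith, connectFromField, connectToField, as, maxDepth) are the stage's own:

```javascript
// Example $graphLookup stage for a hypothetical `employee` collection in
// which each document stores its manager's _id in `manager_id`.
// Starting from a document's manager_id, it follows the chain upwards.
const reportingChain = {
  $graphLookup: {
    from: "employee",               // collection to search recursively
    startWith: "$manager_id",       // value that begins the search
    connectFromField: "manager_id", // field followed at each step
    connectToField: "_id",          // field it is matched against
    as: "managers",                 // output array field
    maxDepth: 5,                    // optional cap on recursion depth
  },
};

console.log(Object.keys(reportingChain.$graphLookup).length); // 6
```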

How can I optimize performance when using MongoDB joins?

To optimize performance when using MongoDB joins, consider the following strategies: use indexes on the "localField" and "foreignField" fields to speed up matching; limit the number of documents in the "from" collection; and use $match and $project stages before the $lookup stage to filter and transform documents.
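
For the post/user example, the indexing advice could be sketched as follows. The index choice is an assumption for illustration, and db is assumed to be a connected handle from the official Node.js driver. The _id field is indexed automatically, so the lookup's foreignField needs nothing extra here:

```javascript
// Hypothetical index choice for the post/user example, expressed as data.
const indexSpecs = [
  // Equality on rating, then descending date: serves $match followed by $sort.
  { collection: "post", keys: { rating: 1, date: -1 } },
];

// `db` is assumed to be a connected Db handle from the Node.js driver.
async function ensureIndexes(db) {
  for (const { collection, keys } of indexSpecs) {
    await db.collection(collection).createIndex(keys);
  }
}

console.log(indexSpecs.map((spec) => spec.collection)); // [ 'post' ]
```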

Can I join multiple collections in MongoDB?

Yes, you can join multiple MongoDB collections by chaining several $lookup stages in an aggregation pipeline. Each $lookup stage adds joined documents from another collection to the input documents.

How do I deal with null or missing values when using MongoDB joins?

When a document in the input collection does not match any document in the "from" collection, the $lookup stage adds an empty array as the "as" field. You can handle these cases by adding a $match stage after the $lookup stage to filter out documents with an empty "as" field.
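
Two stage definitions that could follow a $lookup writing to a userinfo field are sketched below. The variable names are hypothetical; the operators and the preserveNullAndEmptyArrays option are standard:

```javascript
// Option 1: drop unmatched documents by requiring a non-empty array.
const dropUnmatched = { $match: { userinfo: { $ne: [] } } };

// Option 2: keep unmatched documents when flattening. For those
// documents the "userinfo" field is simply absent from the output.
const keepUnmatched = {
  $unwind: { path: "$userinfo", preserveNullAndEmptyArrays: true },
};

console.log(JSON.stringify(dropUnmatched)); // {"$match":{"userinfo":{"$ne":[]}}}
```

Which option fits depends on whether orphaned posts should appear in the result at all.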

Can I use MongoDB joins with sharded collections?

Starting from MongoDB 5.1, the $lookup and $graphLookup stages accept a sharded collection as the "from" collection. However, because of the additional network overhead, performance may not be as good as with unsharded collections.

How do I sort joined documents in MongoDB?

You can sort joined documents by adding a $sort stage after the $lookup stage in the aggregation pipeline. The $sort stage orders documents by the specified fields in ascending or descending order.

Can I use MongoDB joins with the find() method?

No, MongoDB joins cannot be used with the find() method. The $lookup and $graphLookup stages are part of the aggregation framework, which provides more advanced data processing capabilities than find().

How do I debug or troubleshoot problems with MongoDB joins?

To debug or troubleshoot problems with MongoDB joins, you can use the explain() method to analyze the execution plan of the aggregation pipeline. It reports details about each stage, including the number of documents processed, the time taken and index usage.
