Find all duplicate documents in a MongoDB collection by a key field?
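Use the aggregation framework to find all duplicate documents in a MongoDB collection by a key field: group the documents on that field with $group, count the members of each group, and keep only the groups whose count is 2 or more.
To understand the concept, let us create a collection with some documents. The queries to insert the documents are as follows −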
> db.findDuplicateByKeyDemo.insertOne({"StudentId":1,"StudentName":"John"});
{ "acknowledged" : true, "insertedId" : ObjectId("5c7f5b168d10a061296a3c3a") }
> db.findDuplicateByKeyDemo.insertOne({"StudentId":2,"StudentName":"Carol"});
{ "acknowledged" : true, "insertedId" : ObjectId("5c7f5b1f8d10a061296a3c3b") }
> db.findDuplicateByKeyDemo.insertOne({"StudentId":3,"StudentName":"Carol"});
{ "acknowledged" : true, "insertedId" : ObjectId("5c7f5b248d10a061296a3c3c") }
> db.findDuplicateByKeyDemo.insertOne({"StudentId":4,"StudentName":"John"});
{ "acknowledged" : true, "insertedId" : ObjectId("5c7f5b2d8d10a061296a3c3d") }
> db.findDuplicateByKeyDemo.insertOne({"StudentId":5,"StudentName":"Sam"});
{ "acknowledged" : true, "insertedId" : ObjectId("5c7f5b398d10a061296a3c3e") }
> db.findDuplicateByKeyDemo.insertOne({"StudentId":6,"StudentName":"Carol"});
{ "acknowledged" : true, "insertedId" : ObjectId("5c7f5b438d10a061296a3c3f") }
Display all documents from the collection with the help of the find() method. The query is as follows −
> db.findDuplicateByKeyDemo.find().pretty();
The following is the output −
{ "_id" : ObjectId("5c7f5b168d10a061296a3c3a"), "StudentId" : 1, "StudentName" : "John" }
{ "_id" : ObjectId("5c7f5b1f8d10a061296a3c3b"), "StudentId" : 2, "StudentName" : "Carol" }
{ "_id" : ObjectId("5c7f5b248d10a061296a3c3c"), "StudentId" : 3, "StudentName" : "Carol" }
{ "_id" : ObjectId("5c7f5b2d8d10a061296a3c3d"), "StudentId" : 4, "StudentName" : "John" }
{ "_id" : ObjectId("5c7f5b398d10a061296a3c3e"), "StudentId" : 5, "StudentName" : "Sam" }
{ "_id" : ObjectId("5c7f5b438d10a061296a3c3f"), "StudentId" : 6, "StudentName" : "Carol" }
Here is the query to find all duplicate documents by the StudentName field −
> db.findDuplicateByKeyDemo.aggregate([
...    { $group: {
...       _id: { StudentName: "$StudentName" },
...       UIDS: { $addToSet: "$_id" },
...       COUNTER: { $sum: 1 }
...    } },
...    { $match: {
...       COUNTER: { $gte: 2 }
...    } },
...    { $sort : { COUNTER : -1} },
...    { $limit : 10 }
... ]).pretty();
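The same four stages work for any key field, so the pipeline can be built by a small function. The following is a minimal sketch, not part of MongoDB itself: the helper name buildDuplicatePipeline and its limit parameter are assumptions made for illustration, and it assumes the key is a top-level field of the documents.

```javascript
// Hypothetical helper (an assumption, not a MongoDB API): returns the
// duplicate-finding pipeline for a given top-level key field.
function buildDuplicatePipeline(keyField, limit = 10) {
  return [
    // Stage 1: put documents with the same keyField value into one group.
    {
      $group: {
        _id: { [keyField]: "$" + keyField },
        UIDS: { $addToSet: "$_id" }, // _id of every document in the group
        COUNTER: { $sum: 1 }         // number of documents in the group
      }
    },
    // Stage 2: keep only groups with two or more documents (the duplicates).
    { $match: { COUNTER: { $gte: 2 } } },
    // Stage 3: most frequently repeated values first.
    { $sort: { COUNTER: -1 } },
    // Stage 4: cap the number of groups returned.
    { $limit: limit }
  ];
}

const pipeline = buildDuplicatePipeline("StudentName");
console.log(JSON.stringify(pipeline, null, 2));
```

In the shell, the result can be passed straight to aggregate(), for example db.findDuplicateByKeyDemo.aggregate(buildDuplicatePipeline("StudentName")).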
The following is the output displaying the duplicate records. Here, the student 'Carol' appears 3 times, whereas 'John' appears 2 times −
{
   "_id" : {
      "StudentName" : "Carol"
   },
   "UIDS" : [
      ObjectId("5c7f5b248d10a061296a3c3c"),
      ObjectId("5c7f5b438d10a061296a3c3f"),
      ObjectId("5c7f5b1f8d10a061296a3c3b")
   ],
   "COUNTER" : 3
}
{
   "_id" : {
      "StudentName" : "John"
   },
   "UIDS" : [
      ObjectId("5c7f5b2d8d10a061296a3c3d"),
      ObjectId("5c7f5b168d10a061296a3c3a")
   ],
   "COUNTER" : 2
}
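To see why 'Sam' is absent from this result, it can help to trace the grouping by hand. The following is a rough in-memory sketch in plain JavaScript of what the $group, $match and $sort stages compute; it runs in Node without a database. The function name findDuplicatesByKey and the string ids "id1" to "id6" are placeholders assumed for illustration (they stand in for the ObjectId values), and this is not how the server implements aggregation.

```javascript
// Sample data mirroring the collection above; the string _id values are
// placeholders for the real ObjectId values.
const students = [
  { _id: "id1", StudentId: 1, StudentName: "John" },
  { _id: "id2", StudentId: 2, StudentName: "Carol" },
  { _id: "id3", StudentId: 3, StudentName: "Carol" },
  { _id: "id4", StudentId: 4, StudentName: "John" },
  { _id: "id5", StudentId: 5, StudentName: "Sam" },
  { _id: "id6", StudentId: 6, StudentName: "Carol" }
];

// Hypothetical helper: groups documents by keyField, keeps groups with at
// least two members, and sorts them by descending count.
function findDuplicatesByKey(docs, keyField) {
  const groups = new Map();
  for (const doc of docs) {
    const key = doc[keyField];
    if (!groups.has(key)) {
      groups.set(key, { _id: { [keyField]: key }, UIDS: [], COUNTER: 0 });
    }
    const group = groups.get(key);
    group.UIDS.push(doc._id); // _id values are unique, so no dedup is needed
    group.COUNTER += 1;
  }
  return [...groups.values()]
    .filter((group) => group.COUNTER >= 2)   // like $match with $gte: 2
    .sort((a, b) => b.COUNTER - a.COUNTER);  // like $sort with COUNTER: -1
}

const duplicates = findDuplicatesByKey(students, "StudentName");
console.log(JSON.stringify(duplicates, null, 2));
```

The 'Sam' group has a count of 1, so the filter step drops it, leaving 'Carol' with 3 documents and 'John' with 2. Unlike this sketch, $addToSet does not guarantee the order of the collected ids, which is why the UIDS arrays in the shell output are not in insertion order.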