Getting the number of unique values of a query

I have some documents with the following structure:

{
    "_id": "53ad76d70ddd13e015c0aed1",
    "action": "login",
    "actor": {
        "name": "John",
        "id": 21337037
    }
}

How can I write a query in Node.js that returns the number of unique actors that have performed a specific action? For example, if I have an activity stream log that records every action done by the actors, and an actor can perform a specific action multiple times, how can I get the number of unique actors that have done the "login" action? The actors are identified by actor.id.

Answers


db.collection.distinct()

db.collection.distinct("actor.id", { action: "login"})

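will return all unique occurrences of actor.id among the matching documents; the length of the returned array is then the number of unique actors.

PS: do not forget an index on the filtered field, e.g. db.collection.ensureIndex({action: 1}) (createIndex in newer MongoDB versions).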
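
To make the semantics concrete, here is a minimal plain-JavaScript sketch of what distinct with a filter, followed by taking the length, computes. It runs on a hypothetical in-memory array rather than a real collection, and the helper name distinctActorIds is made up for illustration; it is not part of any MongoDB API.

```javascript
// Hypothetical in-memory sample; in MongoDB these would be documents in a collection.
const docs = [
  { action: "login",  actor: { name: "John", id: 21337037 } },
  { action: "login",  actor: { name: "john", id: 21337037 } },
  { action: "logout", actor: { name: "Jane", id: 42 } },
  { action: "login",  actor: { name: "foo",  id: 10000 } },
];

// Rough equivalent of distinct("actor.id", { action: "login" }):
// filter by action, then keep each actor.id only once.
function distinctActorIds(documents, action) {
  const ids = documents
    .filter((doc) => doc.action === action)
    .map((doc) => doc.actor.id);
  return [...new Set(ids)];
}

const uniqueLoginActors = distinctActorIds(docs, "login");
console.log(uniqueLoginActors.length); // 2
```

With a real collection, the same number would come from the length of the array that distinct returns.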


You can use aggregation framework for this:

db.coll.aggregate([
    /* Filter only actions you're looking for */
    { $match : { action : "login" }},
    /* finally group the documents by actors to calculate the num. of actions */
    { $group : { _id : "$actor", numActions: { $sum : 1 }}}
]);

This query groups the documents by the entire actor sub-document and uses $sum to count the actions in each group. The $match stage keeps only the documents with the specified action.

However, that query will work only if the actor sub-documents of a given actor are exactly the same. You said that you identify your actors by the id field, so if, for some reason, the actor sub-documents differ in any other field, your results will be wrong.

Consider these three documents:

{
    ...
    "actor": {
        "name": "John",
        "id": 21337037
    }
},
{
    ...
    "actor": {
        "name": "john",
        "id": 21337037
    }
},
{
    ...
    "actor": {
        "surname" : "Nash",
        "name": "John",
        "id": 21337037           
    }
}

They will end up in three different groups, even though the id field is the same. To overcome this problem, group by actor.id instead:

db.coll.aggregate([
    /* Filter only actions you're looking for */
    { $match : { action : "login" }},
    /* finally group the documents to calculate the num. of actions */
    { $group : { _id : "$actor.id", numActions: { $sum : 1 }}}
]);

This query will correctly group your documents by looking only at the actor.id field.
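
The difference between the two grouping keys can be reproduced without a database. The sketch below is plain JavaScript on the three sample actor sub-documents; groupCount is a hypothetical helper, and JSON.stringify is only an approximation of how whole sub-documents are compared, used here as an assumption to illustrate the point.

```javascript
// The three sample actor sub-documents from above (other fields omitted).
const actors = [
  { name: "John", id: 21337037 },
  { name: "john", id: 21337037 },
  { surname: "Nash", name: "John", id: 21337037 },
];

// Count items per grouping key, similar in spirit to
// { $group: { _id: <key>, numActions: { $sum: 1 } } }.
function groupCount(items, keyOf) {
  const counts = new Map();
  for (const item of items) {
    const key = keyOf(item);
    counts.set(key, (counts.get(key) || 0) + 1);
  }
  return counts;
}

// JSON.stringify stands in for whole-sub-document equality (an approximation).
const byWholeActor = groupCount(actors, (a) => JSON.stringify(a));
const byActorId = groupCount(actors, (a) => a.id);

console.log(byWholeActor.size); // 3 groups, one action each
console.log(byActorId.size);    // 1 group with 3 actions
```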

Edit

You didn't specify what driver you were using so I wrote the examples for MongoDB shell.

Aggregation with the Node.js driver is very similar, with one difference: Node.js is asynchronous, so the results of the aggregation are returned in a callback. See the Node.js driver's aggregation documentation for more examples.

So the aggregation command in Node.js will look like this:

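var MongoClient = require('mongodb').MongoClient;

// Note: this uses the callback-style API of older driver versions;
// newer versions return a cursor from aggregate() instead.
MongoClient.connect('mongodb://127.0.0.1:27017/test', function(err, db) {
    if (err) throw err;

    var collection = db.collection('auditlogs');

    collection.aggregate([
        // Keep only the "login" actions
        { $match : { action : "login" }},
        // One group per actor.id, counting that actor's actions
        { $group : { _id : "$actor.id", numActions: { $sum : 1 }}} ],
        function(err, docs) {
            // Close the connection whether or not the aggregation succeeded
            db.close();
            if (err) return console.error(err);
            console.log(docs);
            // do something with results
        }
    );
});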

For these test documents:

{
    "_id" : ObjectId("53b162ea698171cc1677fab8"),
    "action" : "login",
    "actor" : {
        "name" : "John",
        "id" : 21337037
    }
},
{
    "_id" : ObjectId("53b162ee698171cc1677fab9"),
    "action" : "login",
    "actor" : {
        "name" : "john",
        "id" : 21337037
    }
},
{
    "_id" : ObjectId("53b162f7698171cc1677faba"),
    "action" : "login",
    "actor" : {
        "name" : "john",
        "surname" : "nash",
        "id" : 21337037
    }
},
{
    "_id" : ObjectId("53b16319698171cc1677fabb"),
    "action" : "login",
    "actor" : {
        "name" : "foo",
        "id" : 10000
    }
}

It will return the following result:

[ { _id: 10000, numActions: 1 },
  { _id: 21337037, numActions: 3 } ]
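
The result contains one entry per actor, so the number of unique actors asked for in the question is simply the number of entries. As a sketch, that count can be taken on the client, or an extra $group stage with a constant _id can be appended so the server returns it. The names uniqueActorsPipeline, perActorResult and uniqueActors are arbitrary choices for this example.

```javascript
// Pipeline from above plus one extra stage that counts the groups.
// Grouping on a constant _id (null) folds all per-actor groups into one document.
const uniqueActorsPipeline = [
  { $match: { action: "login" } },
  { $group: { _id: "$actor.id", numActions: { $sum: 1 } } },
  { $group: { _id: null, uniqueActors: { $sum: 1 } } },
];

// Alternatively, count the per-actor results on the client side.
// This array is the result shown above for the test documents.
const perActorResult = [
  { _id: 10000, numActions: 1 },
  { _id: 21337037, numActions: 3 },
];
const uniqueActors = perActorResult.length;
console.log(uniqueActors); // 2
```

Passing uniqueActorsPipeline to aggregate in place of the two-stage pipeline should produce a single document whose uniqueActors field holds the count.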

The aggregation framework is your answer:

db.actors.aggregate([
    // If you really need to filter
    { "$match": { "action": "login" } },
    // Then group
    { "$group": {
        "_id": {
            "action": "$action",
            "actor": "$actor"
        },
        "count": { "$sum": 1 }
    }}
])

Your "actor" combination is "unique", so all you need to do is put the common "grouping keys" under the _id value of the $group pipeline stage and count those "distinct" combinations with $sum.

