Find count and percentage of students failed in a subject

Find count and percentage of students failed in a subject

I am new to MongoDB and learning to write mongodb queries. I need help for below problem:
Input data is a collection student having five records:
[
{
“_id”:0,
“name”:”Maxy”,
“results”:[
{
“subject”:”maths”,
“score”:1.46
},
{
“subject”:”english”,
“score”:11.78
},
{
“subject”:”history”,
“score”:6.67
}
]
},
{
“_id”:1,
“name”:”Nancy”,
“results”:[
{
“subject”:”maths”,
“score”:60.06
},
{
“subject”:”english”,
“score”:52.79
},
{
“subject”:”history”,
“score”:71.76
}
]
},
{
“_id”:2,
“name”:”Peter”,
“results”:[
{
“subject”:”maths”,
“score”:27.03
},
{
“subject”:”english”,
“score”:6.30
},
{
“subject”:”history”,
“score”:20.18
}
]
},
{
“_id”:3,
“name”:”Harry”,
“results”:[
{
“subject”:”maths”,
“score”:71.64
},
{
“subject”:”english”,
“score”:24.80
},
{
“subject”:”history”,
“score”:1.69
}
]
},
{
“_id”:4,
“name”:”Paxy”,
“results”:[
{
“subject”:”maths”,
“score”:28.68
},
{
“subject”:”english”,
“score”:90.29
},
{
“subject”:”history”,
“score”:34.41
}
]
}
]

I need to get the count of students who failed in “maths” subject, where the passing score is 35.
I tried the below approach:
// Will give count of students who failed in subject:maths
var count = db.student.find({
results : {
$elemMatch: {
subject: “maths”,
score: { $lt: 35 }
}
}
}).count() // ==>> 3

// Total students
var total = db.student.count() // ==>> 5

// Percentage
var percentage = count*100/total // ==>> 3*100/5=60%

Is it possible to get the count and percentage of students failed in “maths” subject in a single query?

Solutions/Answers:

Solution 1:

Here is another option to execute using aggregation on collection:

db.student.aggregate([
    {$unwind:"$results"},
    {$match: {"results.subject": "maths"}},
    {$group: {
        _id: null,
        totalCount: {$sum:1},
        failCount: {$sum: { "$cond": [{ "$lt": ["$results.score", 35 ] }, 1, 0 ] }}}},
    {$project: {
        _id : 0,
        totalCount : 1,
        failCount : 1,
        failPercent: {
            $multiply: [ { $divide: ["$failCount", "$totalCount"] }, 100 ]
        }
    }}
])

Student record is considered for total only if results contain maths result; and one student expected to have one result for maths.

Related:  MongoDB: How to find out if an array field contains an element?

Pipeline operators:

unwind -> get individual rows per subject

match – > consider results for maths only

group -> Run grouping to get total counts; use condition to identify failed count

project -> Project required elements and percentage in result output

{
    "totalCount" : 5,
    "failCount" : 3,
    "failPercent" : 60
}

Solution 2:

You would need to run an aggregate operation with MongoDB’s aggregation framework for a single query. Consider running the following pipeline to get the desired result:

db.student.aggregate([
    {
        "$project": {
            "maths_result": {
                "$arrayElemAt": [
                    {
                        "$filter": {
                            "input": "$results",
                            "as": "res",
                            "cond": {
                                "$eq": ["$$res.subject", "maths"]
                            }
                        }
                    },
                    0
                ]
            }
        }
    },
    {
        "$project": {
            "maths_fail": { "$lt": ["$maths_result.score", 35 ] }
        }
    },
    {
        "$group": {
            "_id": null,
            "total": { "$sum": 1 },
            "fail_count": { "$sum": { "$cond": ["$maths_fail", 1, 0 ] } }
        }
    },
    {
        "$project": {
            "percentage_maths_fail": {
                "$multiply": [
                    { "$divide": ["$fail_count", "$total"] },
                    100
                ]
            }
        }
    }
])

Pipeline Explanation

When executing a pipeline, MongoDB pipes operators into each other. “Pipe” here takes the Linux meaning: the output of an operator becomes the input of the following operator. The result of each operator is a new collection of documents. So Mongo executes the above pipeline as follows:

Related:  Listen for changes in MongoDB
collection | $project | $project | $group | $project => result

The first pipeline step $project, is quite similar to SELECT statement in SQL where in this case it is used to create a new field that has a single element from the results array. This is made possible through the $arrayElemAt and $filter operators. The $filter operator expression will return an array that matches the given condition where the subject field is “maths” and the $arrayElemAt returns the first element in the array, akin to flattening it.

So, running the pipeline with this step only will yield the results:

db.student.aggregate([
    {
        "$project": {
            "maths_result": {
                "$arrayElemAt": [
                    {
                        "$filter": {
                            "input": "$results",
                            "as": "res",
                            "cond": {
                                "$eq": ["$$res.subject", "maths"]
                            }
                        }
                    },
                    0
                ]
            }
        }
    }
])

Sample Output

/* 1 */
{
    "_id" : 0,
    "maths_result" : {
        "subject" : "maths",
        "score" : 1.46
    }
}

/* 2 */
{
    "_id" : 1,
    "maths_result" : {
        "subject" : "maths",
        "score" : 60.06
    }
}

/* 3 */
{
    "_id" : 2,
    "maths_result" : {
        "subject" : "maths",
        "score" : 27.03
    }
}

/* 4 */
{
    "_id" : 3,
    "maths_result" : {
        "subject" : "maths",
        "score" : 71.64
    }
}

/* 5 */
{
    "_id" : 4,
    "maths_result" : {
        "subject" : "maths",
        "score" : 28.68
    }
}

The second step with the next $project operator will reshape the above field so that it returns either true or false for the fail condition:

db.student.aggregate([
    {
        "$project": {
            "maths_result": {
                "$arrayElemAt": [
                    {
                        "$filter": {
                            "input": "$results",
                            "as": "res",
                            "cond": {
                                "$eq": ["$$res.subject", "maths"]
                            }
                        }
                    },
                    0
                ]
            }
        }
    },
    {
        "$project": {
            "maths_fail": { "$lt": ["$maths_result.score", 35 ] }
        }
    }       
])

Sample Output

/* 1 */
{
    "_id" : 0,
    "maths_fail" : true
}

/* 2 */
{
    "_id" : 1,
    "maths_fail" : false
}

/* 3 */
{
    "_id" : 2,
    "maths_fail" : true
}

/* 4 */
{
    "_id" : 3,
    "maths_fail" : false
}

/* 5 */
{
    "_id" : 4,
    "maths_fail" : true
}

The 3rd step, $group, groups all the documents as a whole to aggregate the counts. The total document count is derived by the expression "total": { "$sum": 1 } whilst the conditional sum

"$sum": { "$cond": ["$maths_fail", 1, 0 ] } 

will give you the total number of fails:

db.student.aggregate([
    {
        "$project": {
            "maths_result": {
                "$arrayElemAt": [
                    {
                        "$filter": {
                            "input": "$results",
                            "as": "res",
                            "cond": {
                                "$eq": ["$$res.subject", "maths"]
                            }
                        }
                    },
                    0
                ]
            }
        }
    },
    {
        "$project": {
            "maths_fail": { "$lt": ["$maths_result.score", 35 ] }
        }
    },
    {
        "$group": {
            "_id": null,
            "total": { "$sum": 1 },
            "fail_count": { "$sum": { "$cond": ["$maths_fail", 1, 0 ] } }
        }
    }
])

Sample Output

/* 1 */
{
    "_id" : null,
    "total" : 5,
    "fail_count" : 3
}

The final pipeline will then give you the percentage by using the arithmetic operators $divide and $multiply:

{
    "$project": {
        "percentage_maths_fail": {
            "$multiply": [
                { "$divide": ["$fail_count", "$total"] },
                100
            ]
        }
    }
}

Final Output

/* 1 */
{
    "_id" : null,
    "percentage_maths_fail" : 60
}
Related:  Looking for middleware from mongodb to tableau

References