Sunday 25 June 2017

Checking a number is product of two prime numbers

If  m and n are positive integer and n is a divisor or factor of m  then there exists a non zero integer p such that :

n * p = m where (n , p , m  > 0) and n | m (Usual notation to show n is divisor of m)

Case : We need to find a if a number is multiplier of two prime numbers  so we need to find n and p such that n * p = m

E.g ,

Input =35  then output = true   [ 35 = 7 * 5 (both are prime numbers)]

Input =21  then output = true  [21 = 7 * 3 ]

Input =20  then output = false [20 = 2 * 10 or 4 *5   not prime numbers]

Code snippet : 

public static boolean checkMultiplierOfPrime(int n){
if( n< 6){
return false;  ///as least number satifying the condition is 6 = 2 * 3
}
int k = (int) Math.sqrt(n);  /////important catch here to find the square root
for (int i = k; i >= 2; i--) {
if(n % i == 0){
int j = n/i;
if(isPrimeNumber(i) && isPrimeNumber(j)){
return true;
}
}
}
return false;
}

 public static boolean isPrimeNumber(int number){
    if(number<=1){
    return false;
    }
    for(int i=2; i<=number/2; i++){
    if(number % i == 0){
    return false;
    }
    }
    return true;
    }


Time Complexity : O(n) for a number of n  , iteration would from 2 to n/2 , hence complexity would be O(n/2) means O(n)

Enjoy the coding !!!!!!
       

Sunday 4 June 2017

Aggregation Framework - finding & removing the duplicate elements in MongoDB

 Given a collection  name product having following documents in the collection.


{ "_id" : NumberInt("1"),"account" : "abc","vendor" : "amazon","category" : "mobile"},
{"_id" : NumberInt("2"), "account" : "abc","vendor" : "overstock","category" : "mobile"},
{"_id" : NumberInt("3"),"account" : "xyz","vendor" : "fashionStreet","category" : "fashion"},
{"_id" : NumberInt("4"),"account" : "pqr","vendor" : "foodTruck","category" : "food"},
{"_id" : NumberInt("5"),"account" : "abc","vendor" : "amazon", "category" : "mobile"}

If one notice there are duplicate elements exist in the collection with the composite key(account+ vendor+ category) marked in blue on colour , so how do we identify the duplicates in a collection and how do we remove the duplicate keys in the MongoDB?

This problem is easily solved using Aggregation pipeline framework which provides  $group operator to group an compound or composite key along with the $match operator to match a criteria.

db.product.aggregate([

{ $group: { 

_id: { account: "$account", vendor: "$vendor", category: "$category" }, 
count: { $sum: 1 } 
}}, 
{
    $match: {
        count: { $gt: 1 }
    }
}, 
{
 $sort: { 
    count : -1 }
 }
]);

Above query find all the duplicates elements whose account, vendor , category  values are same and having duplicate values and output of the above query is as follow  with the descending order of count : 

Output

{
"_id" : {
"account" : "abc",
"vendor" : "amazon",
"category" : "mobile"
},
"count" : 2
}

Note that _id is made composite of  3 columns account , vendor , category so in general 

_id :  { key1 : "$key1",  key2 : "$key2", key3 : "$key3", key4 : "$key4", key5 : "$key5" }

and count plays an important role to find out the number of duplicates in the collection.

With the above example , we have identified the duplicate elements in the collection , so the next question arises , how to remove duplicate keys in a collection in MongoDB?

var duplicates = [];
db.product.aggregate([
  { $group: { 
    _id: { account: "$account" , vendor : "$vendor", category : "$category"}, // can be grouped on multiple properties/keys/columns 
    dups: { "$addToSet": "$_id" }, 
    count: { "$sum": 1 } 
  }}, 
  { $match: { 
    count: { "$gt": 1 }    // Duplicates considered as count greater than one
  }}
])             
.forEach(function(doc) {
    doc.dups.shift();      // First element skipped for deleting and important line here
    doc.dups.forEach( function(dupId){ 
        duplicates.push(dupId);   // Getting all duplicate ids
        }
    )    
});

// If you want to Check all "_id" which you are deleting
printjson(duplicates); 

Output : 

[ NumberInt("1") ]



Now we have identified the array of duplicate "_id"   as shown in the output of the above query.
To remove all the duplicates from the collection  , need to execute the following 

// Remove all duplicates in one go    
db.product.remove({_id:{$in:duplicates}});

db.product.find({});

Output

{"_id" : NumberInt("2"), "account" : "abc","vendor" : "overstock","category" : "mobile"},
{"_id" : NumberInt("3"),"account" : "xyz","vendor" : "fashionStreet","category" : "fashion"},
{"_id" : NumberInt("4"),"account" : "pqr","vendor" : "foodTruck","category" : "food"},
{"_id" : NumberInt("5"),"account" : "abc","vendor" : "amazon", "category" : "mobile"}


So you will get an amazing result with the all the records are unique in the collection.

Be alert while applying an index on the composite columns with {unique: true, dropDups: true}). 

Suppose in the above scenario we have applied these condition then it will throw an error as 


{
    "createdCollectionAutomatically" : false,
    "numIndexesBefore" : 1,
    "errmsg" : "exception: E11000 duplicate key error index
    "code" : 11000,
    "ok" : 0
}
Note that dropDups has been removed in the Mongo 3.x release and has a severe disadvantage of removing the any random duplicate element.

Even if you apply ensureIndex() or createIndex()  at the time of collection having duplicate keys then it will prompt   exception "exception: E11000 duplicate key error index"  , so its important to remove the duplicate elements with the above mentioned aggregation example and  then applying the {unique: true}  on the composite index created with the fields say here {account, vendor , category}.


Enjoy the MongoDB  and the brilliant  Aggregation Framework !!!!

Enjoy  the coding with tek9g!!!