Simplifying your back end using MongoDB as queue repository and persistent database

July 31, 2017

Marcel Balassiano, SQL Server and MongoDB consulting

 

In this post, I intend to show a back-end architecture implemented for one of our clients. This architecture uses MongoDB as both the permanent database and the queue repository. It is important to know that MongoDB is not always the best database to use as a queue repository, because it is limited in the number of inserts per second it can absorb. My baseline is very simple: if the number of inserts per second is greater than 10,000, then MongoDB should not be used as a queue repository.

 

I use the following query to get an idea of how many insert operations per second the database receives. It returns the number of documents inserted per second, based on the insertedDate field:

db.collection.aggregate([
    {
        $group: {
            _id: { DateSt: { $dateToString: { format: "%Y-%m-%d-%H-%M-%S", date: "$insertedDate" } } },
            count: { $sum: 1 }
        }
    }
])

 

Before running this query, I first count inserts per minute by changing the format to "%Y-%m-%d-%H-%M".

After finding the peak minutes, I group by second (a heavier query).
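
For that first, lighter pass, only the format string changes. A sketch based on the query above, with a sort added so the busiest minutes appear first:

db.collection.aggregate([
    {
        // Count inserted documents per minute to locate the peak minutes.
        $group: {
            _id: { DateSt: { $dateToString: { format: "%Y-%m-%d-%H-%M", date: "$insertedDate" } } },
            count: { $sum: 1 }
        }
    },
    // Busiest minutes first.
    { $sort: { count: -1 } }
])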

 

The old architecture caused poor insert performance because:

  1. Only one collection receives all events (1), which increases the number of locks on that single collection.

  2. A single validation process runs (3); it is not possible to create multiple validation processes, because there is a risk that the same event would be sent twice.

  3. A control mechanism (step 3.1) determines whether each event was sent or not; it updates the "status" field of each document in the collection and consequently decreases insert performance.
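
To make point 3 concrete, the old control mechanism performed roughly this kind of per-document update (a sketch; the collection name events and the eventId variable are placeholders, not the client's actual names):

// Old architecture, step 3.1 (sketch): after an event is sent, its "status" field
// is updated in the same single collection that receives all new inserts,
// so every update competes with the inserts for locks on that collection.
db.events.updateOne(
    { _id: eventId },
    { $set: { status: "sent" } }
);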

 

 

  • The first change is to create 4 collections to receive events. The events are divided between the collections using a specific field in each document (ex: the field zone). Process (1) reads this field value and inserts the event into the correct collection (see the routing sketch after the list of new collections below).


 

New Collections:

   Queue_zone_center -> receives events where zone = center

   Queue_zone_south  -> receives events where zone = south

   Queue_zone_north  -> receives events where zone = north

   Queue_zone_others -> receives events from all other zones
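
A minimal sketch of the routing step of process (1) in the mongo shell, assuming each event document carries a zone field as described above; the routeEvent helper and the event shape are illustrative, not the client's actual code:

// Process (1): route each incoming event to the queue collection of its zone.
function routeEvent(event) {
    switch (event.zone) {
        case "center": db.Queue_zone_center.insertOne(event); break;
        case "south":  db.Queue_zone_south.insertOne(event);  break;
        case "north":  db.Queue_zone_north.insertOne(event);  break;
        default:       db.Queue_zone_others.insertOne(event); // all other zones
    }
}

// Example:
routeEvent({ event_id: 112, zone: "north", insertedDate: new Date() });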

 

 

  • For each collection there is now one validation process. Compared to the old architecture, which had only one process, there are now 4 processes validating events, so validation runs up to 4x faster.

  • Create a new collection to control the last processed event id for each validation process (example documents below; a sketch of a full validation cycle follows them).

Ex: collection_control_process:

{ name: "Process_zone_north",  last_id: 111,  last_run: "2017-01-05 23:13:13" }

{ name: "Process_zone_south",  last_id: 443,  last_run: "2017-01-05 23:13:16" }

{ name: "Process_zone_center", last_id: 5623, last_run: "2017-01-05 23:13:12" }

{ name: "Process_zone_others", last_id: 34,   last_run: "2017-01-05 23:13:12" }

 

After one cycle of the zone-north process:

Before:

{ name: "Process_zone_north", last_id: 111, last_run: "2017-01-05 23:13:13" }

Updated to:

{ name: "Process_zone_north", last_id: 987, last_run: "2017-01-05 23:17:20" }
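
Putting the pieces together, one cycle of the zone-north validation process might look like the sketch below; the event_id field and the processing step are assumptions based on the control documents above:

// One cycle of the zone-north validation process (sketch).
var ctrl = db.collection_control_process.findOne({ name: "Process_zone_north" });

// Read only the events that arrived after the last processed id.
var events = db.Queue_zone_north.find({ event_id: { $gt: ctrl.last_id } })
                                .sort({ event_id: 1 })
                                .toArray();

events.forEach(function (ev) {
    // ... validate and send the event ...
});

// Record the last processed id and run time, so the same event is never sent twice.
if (events.length > 0) {
    db.collection_control_process.updateOne(
        { name: "Process_zone_north" },
        { $set: { last_id: events[events.length - 1].event_id, last_run: new Date() } }
    );
}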

 

My purpose in this post is to show a back end that uses only MongoDB as both the queue repository and the persistent database, but remember that other technologies can be more efficient, depending on your database workload.

 

 
