Simplifying your back end using MongoDB as queue repository and persistent database

Marcel Balassiano SQL Server and MongoDB consulting

In this post, I show a back-end architecture that we implemented for one of our clients. This architecture uses MongoDB both as the permanent database and as the queue repository. It is important to know that MongoDB is not the best database to use as a queue repository, because it is limited by the number of inserts per second it can sustain. My baseline is very simple: if the number of inserts per second is greater than 10,000, then MongoDB should not be used as a queue repository.

I use the following query to get an idea of how many I/O operations per second the database handles. It returns the number of documents inserted per second, based on the field insertedDate:

db.collection.aggregate([ { $group: { _id: { DateSt: { $dateToString: { format: "%Y-%m-%d-%H-%M-%S", date: "$insertedDate" } } }, count: { $sum: 1 } } } ])

Before running this query, I count the inserts per minute by changing the format to "%Y-%m-%d-%H-%M".

After finding the peak minutes, I group by second (a heavy query).
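The two-pass approach described above can be sketched as a small helper that builds both pipelines. This is only an illustration: the function name `buildInsertRatePipeline` and the optional `from`/`to` time window are assumptions of this sketch, not part of the original query. The stages used (`$match`, `$group`, `$sort`, `$limit`) are standard aggregation stages. Restricting the per-second pass to one peak minute is one way to keep the heavy query small.

```javascript
// Hypothetical helper: builds an aggregation pipeline that counts inserts
// per time bucket, based on the insertedDate field used in this post.
function buildInsertRatePipeline(format, from, to) {
  const pipeline = [];
  // Optional time window (an assumption of this sketch): limits the scan
  // to [from, to) so the per-second pass only reads the peak minute.
  if (from && to) {
    pipeline.push({ $match: { insertedDate: { $gte: from, $lt: to } } });
  }
  pipeline.push({
    $group: {
      _id: { DateSt: { $dateToString: { format: format, date: "$insertedDate" } } },
      count: { $sum: 1 },
    },
  });
  // Busiest buckets first; keep only the top ten.
  pipeline.push({ $sort: { count: -1 } }, { $limit: 10 });
  return pipeline;
}

// Pass 1: per-minute counts over the whole collection.
const perMinute = buildInsertRatePipeline("%Y-%m-%d-%H-%M");

// Pass 2: per-second counts inside one peak minute (example values).
const perSecond = buildInsertRatePipeline(
  "%Y-%m-%d-%H-%M-%S",
  new Date("2017-01-05T23:13:00Z"),
  new Date("2017-01-05T23:14:00Z")
);

// In the mongo shell: db.collection.aggregate(perMinute)
console.log(JSON.stringify(perSecond, null, 2));
```

The first pipeline is cheap enough to run over the whole collection; the second one should only be run for the minutes that the first one flags as busiest.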

The old architecture caused poor insert performance because:

  1. Only one collection receives all events (1), which increases the number of locks on that collection.

  2. Only a single validation process runs (3). It is not possible to run multiple validation processes, because there is a risk that the same event will be sent twice.

  3. There is a control mechanism (step 3.1) that determines whether an event was sent or not. The mechanism updates the field "status" of each document in the collection and consequently decreases insert performance.

The new architecture introduces the following changes:

  • The first change is to create 4 collections to receive the events. The events are divided between the collections according to a specific field in the document (e.g. the field zone). Process (1) reads this field's value and inserts each event into the matching collection.

New collections:

Queue_zone_center -> receives events where field zone = center

Queue_zone_south -> receives events where field zone = south

Queue_zone_north -> receives events where field zone = north

Queue_zone_others -> receives events from all other zones
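The routing rule in the list above can be sketched as a small function. The helper name `queueCollectionFor`, the lowercase normalization of the zone value, and the sample events are assumptions made for illustration; the collection names follow the list above.

```javascript
// Zones that have a dedicated queue collection, as listed above.
const DEDICATED_ZONES = ["center", "south", "north"];

// Hypothetical helper: returns the queue collection for an event,
// based on its zone field. Any other zone falls back to "others".
function queueCollectionFor(event) {
  // Lowercasing is an assumption of this sketch, not stated in the post.
  const zone = String(event.zone || "").toLowerCase();
  return DEDICATED_ZONES.includes(zone)
    ? "Queue_zone_" + zone
    : "Queue_zone_others";
}

// Example events (made-up values for illustration).
const events = [
  { id: 1, zone: "north" },
  { id: 2, zone: "center" },
  { id: 3, zone: "east" },
];

for (const e of events) {
  // In the mongo shell this would be something like:
  // db.getCollection(queueCollectionFor(e)).insertOne(e)
  console.log(e.id, "->", queueCollectionFor(e));
}
```

Keeping the rule in one place means the inserting process (1) and the validation processes agree on which collection holds which zone.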

  • Each collection has its own validation process. Compared to the old architecture, which had only 1 process, there are now 4 processes validating events, so validation can run up to 4x faster.

  • Create a new collection that keeps track of the last event id handled by each process.

Example: collection_control_process:

Document: { name: "Process_zone_north", last_id: 111, last_run: "2017-01-05 23:13:13" }

Document: { name: "Process_zone_south", last_id: 443, last_run: "2017-01-05 23:13:16" }

Document: { name: "Process_zone_center", last_id: 5623, last_run: "2017-01-05 23:13:12" }

Document: { name: "Process_zone_others", last_id: 34, last_run: "2017-01-05 23:13:12" }

After one cycle of the zone north process:

Before:

Document: { name: "Process_zone_north", last_id: 111, last_run: "2017-01-05 23:13:13" }

Updated to:

Document: { name: "Process_zone_north", last_id: 987, last_run: "2017-01-05 23:17:20" }
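The control-document cycle above can be sketched on in-memory data. This is an illustration under stated assumptions, not the client's actual implementation: the function `runCycle`, the event field `id`, and the `validate` callback are hypothetical, and the sketch assumes that event ids increase over time so that "id greater than last_id" selects exactly the unprocessed events. The shell calls in the comments show how the same steps might look against the collections named in this post.

```javascript
// One validation cycle for a single queue, simulated on in-memory arrays.
// Assumption: event ids increase over time, so "id > last_id" selects
// exactly the events that were not processed yet.
function runCycle(queue, control, validate, now) {
  // Shell equivalent of the read (sketch):
  // db.Queue_zone_north.find({ id: { $gt: control.last_id } }).sort({ id: 1 })
  const pending = queue
    .filter((e) => e.id > control.last_id)
    .sort((a, b) => a.id - b.id);

  let lastId = control.last_id;
  for (const event of pending) {
    validate(event); // validate and send the event (details not shown)
    lastId = event.id;
  }

  // One small update per cycle instead of a status update per event.
  // Shell equivalent (sketch):
  // db.collection_control_process.updateOne(
  //   { name: control.name }, { $set: { last_id: lastId, last_run: now } })
  return { name: control.name, last_id: lastId, last_run: now };
}

// Example values taken from the control documents above; the queue
// contents are made up for illustration.
const control = { name: "Process_zone_north", last_id: 111, last_run: "2017-01-05 23:13:13" };
const queue = [{ id: 110 }, { id: 111 }, { id: 500 }, { id: 987 }];
const sent = [];

const updated = runCycle(queue, control, (e) => sent.push(e.id), "2017-01-05 23:17:20");
console.log(updated, sent);
```

Because each process owns one queue collection and one control document, the queue documents themselves are never updated, which is what removes the per-event "status" write of the old architecture. In this sketch the checkpoint is written once at the end of the cycle, so a failure in the middle of a cycle would cause the same events to be read again on the next run.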

My purpose in this post is to show a back end that uses only MongoDB as both the queue repository and the persistent database. Remember, however, that other technologies can be more efficient, depending on your database workload.


#MongoDB #MarcelBalassiano #architecture



© 2018 by NAYA Technologies. All rights reserved