My Experience with MongoDB as an RDBMS User – Thoughts on Data Modeling and Design

 2025-04-10

 Programming

Recently, I’ve been building a headless CMS using Rust and MongoDB for my personal blog. Though I call it a “headless CMS,” it’s really a simple RESTful API that handles CRUD operations for users, articles, categories, and tags. Since I’ve primarily used relational database management systems (RDBMS) in the past, working with a document-oriented database was a refreshing experience, and I wanted to share some of my thoughts after using it.

What Kind of Database is MongoDB?

MongoDB is a document-oriented database. While relational databases organize data into rows and columns, much like spreadsheet tables, document-oriented databases store data as JSON-like documents. Put simply, it’s like storing JSON objects directly. Internally, MongoDB stores these documents in a binary format called BSON (Binary JSON).

This allows you to store nested data structures without the need for normalization. For instance, an object that contains other objects can be stored as-is. This means even queries that would require complex JOINs in a relational DB can be handled more simply in MongoDB.
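
As a rough illustration of the difference, the sketch below uses plain Python dictionaries as stand-ins for rows and documents. No MongoDB driver is involved, and every name in it (users_table, articles_table, article_doc, and the two fetch functions) is hypothetical.

```python
# Relational style: two "tables" linked by a foreign key (author_id).
users_table = {1: {"id": 1, "name": "Kuro"}}
articles_table = {10: {"id": 10, "title": "Article Title", "author_id": 1}}


def fetch_article_relational(article_id):
    """Emulate a JOIN: read the article row, then look up its author row."""
    row = articles_table[article_id]
    author = users_table[row["author_id"]]
    return {"title": row["title"], "author": {"name": author["name"]}}


# Document style: the author object is nested inside the article itself.
article_doc = {
    "_id": 10,
    "title": "Article Title",
    "author": {"name": "Kuro"},
}


def fetch_article_document(doc):
    """A single read returns the article together with its nested author."""
    return {"title": doc["title"], "author": doc["author"]}
```

Both functions return the same shape, but the document version gets there with one read and no lookup in a second structure.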

That said, document-oriented databases use different terminology and concepts from RDBMS, and they don’t always map 1-to-1:

  • Field: A named data item, roughly equivalent to a column in an RDBMS.
  • Value: The value corresponding to a field.
  • Document: A set of field-value pairs, similar to a record (row) in an RDBMS.
  • Collection: A group of documents, comparable to a table in an RDBMS.

Why I Chose a Document-Oriented Database

If you’ve ever built a blog system, you know it’s relatively simple: articles have a many-to-one relationship with users and categories, and a many-to-many relationship with tags. Even with RDBMS, this setup isn’t overly complex. However, since I aimed to support both Japanese and English, I anticipated a bit more complexity and decided to try MongoDB. Honestly, a big reason was simply technical curiosity.

While ORMs help solve impedance mismatch—the inconsistency between DB structures and in-code models—I wondered if document-oriented databases might offer a more elegant solution.

You Can Design Like RDBMS… But Should You?

In document databases, related data can be embedded directly in the same document. This embedded model usually makes queries simpler and reads faster, because everything a query needs arrives in a single document. However, MongoDB also allows you to link documents across collections using IDs, much like foreign keys in an RDBMS.

If you’re used to RDBMS, you might instinctively model your data using IDs to relate collections. But resolving those references takes either extra queries or an aggregation stage such as $lookup, which can make queries more complex and slower than reading embedded data.

Pros and Cons of Linking Collections via IDs

  • Pros: Easier to update and maintain data integrity.
  • Cons: Heavier processing and more complex queries compared to embedded objects.
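
To make the trade-off concrete, here is a minimal sketch that again uses in-memory dictionaries rather than a real driver. The IDs, the second article, and the function names load_article and rename_category are assumptions made up for illustration.

```python
# Master category data, keyed by ID.
categories = {
    "cat-1": {"_id": "cat-1", "name": "Web Technology", "slug": "web-tech"},
}

# Articles hold only the category ID, not the category itself.
articles = [
    {"_id": "art-1", "title": "Article Title", "author": "Kuro", "category": "cat-1"},
    {"_id": "art-2", "title": "Another Article", "author": "Kuro", "category": "cat-1"},
]


def load_article(article_id):
    """Read an article, then resolve its category with a second lookup."""
    article = next(a for a in articles if a["_id"] == article_id)
    resolved = dict(article)
    resolved["category"] = categories[article["category"]]
    return resolved


def rename_category(category_id, new_name):
    """One write to the master record is enough; articles only store the ID."""
    categories[category_id]["name"] = new_name
```

The update side is a single write, while every read pays for the extra lookup. That is the pro and the con from the list above in miniature.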

Watch Out When Embedding Data

While embedding related objects in a single document gives you simpler and faster queries, it comes with caveats—especially when it comes to updates and data consistency.

In a typical blog system, it’s common for multiple articles to share the same category. If categories are embedded and one changes, you’d need to update every related article to keep data consistent. With ID linking, you’d just update the master category document.

So when using embedding, the burden of maintaining data consistency falls on the developer. If you’re used to RDBMS, this can be a tricky mindset shift.
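
The following sketch shows how embedded copies drift once the master changes. It is a simplified in-memory model, not MongoDB itself, and the IDs and the helper stale_articles are hypothetical.

```python
import copy

category_master = {"_id": "cat-1", "name": "Web Technology", "slug": "web-tech"}

# Each article stores its own copy of the category, taken at write time.
articles = [
    {"_id": "art-1", "title": "Article Title", "category": copy.deepcopy(category_master)},
    {"_id": "art-2", "title": "Another Article", "category": copy.deepcopy(category_master)},
]


def stale_articles(master, docs):
    """Return the IDs of articles whose embedded category differs from the master."""
    return [d["_id"] for d in docs if d["category"] != master]


# Updating only the master leaves every embedded copy out of date.
category_master["name"] = "Web Tech"
print(stale_articles(category_master, articles))  # prints ['art-1', 'art-2']
```

Nothing in the data store flags these copies as outdated. The application has to know they exist and fix them itself.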

The Approach I Took for My Blog System

In my case, I created separate collections for categories and tags as master data, then embedded the same data into each article document.

Normally, you’d just reference the master data using IDs. But because my main focus was on retrieval (the “R” in CRUD), I prioritized query simplicity and performance—especially when fetching articles and their associated category/tag data.

When Using ID References

Articles Collection

{
    "_id" : ObjectId("YYYYYYYYYYYYYY"),
    "title" : "Article Title",
    "author" : "Kuro",
    ...
    "category" : ObjectId("XXXXXXXXXXXXXXXX")
}

Categories Collection

{
    "_id" : ObjectId("XXXXXXXXXXXXXXXX"),
    "name" : "Web Technology",
    "slug" : "web-tech"
}

When Embedding Data

Articles Collection

{
    "_id" : ObjectId("YYYYYYYYYYYYYY"),
    "title" : "Article Title",
    "author" : "Kuro",
    ...
    "category" : {
        "_id" : ObjectId("XXXXXXXXXXXXXXXX"),
        "name" : "Web Technology",
        "slug" : "web-tech"
    }
}

Category Master Collection

{
    "_id" : ObjectId("XXXXXXXXXXXXXXXX"),
    "name" : "Web Technology",
    "slug" : "web-tech"
}

The downside of this approach is that whenever a category or tag is updated, you must also update all embedded instances in the article documents. You can’t rely on the database to maintain consistency. However, since updates to these master datasets are infrequent, I opted for a heavier update process when they do change.
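
The heavier update might look something like the sketch below. It is an in-memory approximation with hypothetical names (update_category, the cat-1/cat-2 IDs, and the extra articles), not the actual code of this blog system. With a real driver, the loop would roughly correspond to a multi-document update filtered on the embedded category ID.

```python
# Master category collection, keyed by ID.
category_master = {
    "cat-1": {"_id": "cat-1", "name": "Web Technology", "slug": "web-tech"},
    "cat-2": {"_id": "cat-2", "name": "Databases", "slug": "databases"},
}

# Articles embed a copy of their category.
articles = [
    {"_id": "art-1", "title": "Article Title", "category": dict(category_master["cat-1"])},
    {"_id": "art-2", "title": "Another Article", "category": dict(category_master["cat-2"])},
    {"_id": "art-3", "title": "Third Article", "category": dict(category_master["cat-1"])},
]


def update_category(master, docs, category_id, changes):
    """Update the master record, then rewrite every embedded copy of it.

    Returns the number of article documents that were touched.
    """
    master[category_id].update(changes)
    touched = 0
    for doc in docs:
        if doc["category"]["_id"] == category_id:
            # Replace the embedded copy with a fresh copy of the master.
            doc["category"] = dict(master[category_id])
            touched += 1
    return touched
```

Calling update_category(category_master, articles, "cat-1", {"name": "Web Tech"}) rewrites the two articles in that category and leaves the third alone. The cost grows with the number of articles per category, which is acceptable here only because such updates are rare.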

There’s no one-size-fits-all solution for data modeling. For this blog system, though, I think this was the right choice.

Thoughts on Using MongoDB

The biggest win was how seamlessly I could store and retrieve objects without worrying about data mapping or ORM complexity. The document-oriented model really shines here.

That said, being used to RDBMS, I found myself unintentionally falling back on ID-based designs. I had to consciously switch gears to MongoDB’s embedding model to take full advantage of its performance benefits.

Since the system isn’t in production yet, I haven’t tested load or availability. But based on my development experience so far, I’ve been quite pleased with MongoDB.