Mongoose ORM: Middleware & Validation Guide

Sajan Acharya

Author

November 18, 2024
18 min read

Why Mongoose?

Mongoose provides a straightforward, schema-based way to model your application data. Unlike the raw MongoDB driver, Mongoose gives you structure, validation, type casting, and middleware hooks. It suits Node.js applications that need data integrity and business-logic enforcement.

Key Benefits of Mongoose:

  • Schema-based: Define clear data structures with validation rules
  • Middleware Hooks: Pre/post hooks for automatic logic execution
  • Type Casting: Automatic conversion of data types
  • Query Builder: Chainable, intuitive query API
  • Virtual Properties: Computed properties without database storage
  • Business Logic: Encapsulate application rules directly in models

Schema Definition & Type Safety

Schemas are the foundation of Mongoose. They define the structure and constraints for your documents:

const mongoose = require('mongoose');

const userSchema = new mongoose.Schema({
  // Basic field with type
  email: {
    type: String,
    required: true,
    unique: true,
    lowercase: true,
    trim: true,
    match: /.+@.+\..+/  // Basic email format check
  },
  
  // Nested object
  profile: {
    firstName: String,
    lastName: String,
    age: {
      type: Number,
      min: 0,
      max: 150
    }
  },
  
  // Arrays
  tags: [String],
  posts: [{
    type: mongoose.Schema.Types.ObjectId,
    ref: 'Post'
  }],
  
  // Enum field
  role: {
    type: String,
    enum: ['admin', 'user', 'moderator'],
    default: 'user'
  },
  
  // Stored as a bcrypt hash (see the pre-save hook below)
  password: {
    type: String,
    required: true
  }
}, { timestamps: true }); // Adds and maintains createdAt and updatedAt automatically

const User = mongoose.model('User', userSchema);
module.exports = User;
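
Because `match` takes an ordinary regular expression, the pattern can be checked on its own before it goes into a schema. The sketch below assumes a deliberately loose pattern (something, `@`, something, a literal dot, something). `emailPattern` and `looksLikeEmail` are illustrative names, not part of Mongoose.

```javascript
// Loose email check: the dot must be escaped (\.) to match a literal "."
const emailPattern = /.+@.+\..+/;

// Hypothetical helper: true only for strings that fit the pattern
function looksLikeEmail(value) {
  return typeof value === 'string' && emailPattern.test(value);
}

console.log(looksLikeEmail('user@example.com')); // true
console.log(looksLikeEmail('user@localhost'));   // false (no dot after the @)
```

A pattern this loose only catches obvious typos; it is not a full address validator.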

Middleware (Hooks) - The Power of Mongoose

Middleware (hooks) are functions that execute at specific points in a document's lifecycle. They're incredibly powerful for automating business logic.

Pre-Save Hook - Password Hashing Example

Hash passwords before saving them to the database (never store plain text):

const bcrypt = require('bcryptjs');

userSchema.pre('save', async function() {
  // Only hash when the password is new or has changed
  if (!this.isModified('password')) {
    return;
  }

  // Hash with a cost factor of 10; a thrown error rejects save()
  const salt = await bcrypt.genSalt(10);
  this.password = await bcrypt.hash(this.password, salt);
});

// Usage
const user = new User({ email: 'user@example.com', password: 'secret123' });
await user.save(); // Password automatically hashed before saving
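
Hashing on save covers writes; at login the candidate password still has to be checked against the stored hash. With bcryptjs that is usually `bcrypt.compare(candidate, storedHash)`, often exposed as an instance method through `userSchema.methods`. The sketch below shows the same hash-then-verify round trip, but uses Node's built-in `crypto.scryptSync` as a stand-in for bcryptjs so that it runs without extra packages. `hashPassword` and `verifyPassword` are hypothetical helper names.

```javascript
const crypto = require('crypto');

// Stand-in for bcrypt.hash: returns "salt:hash" as hex strings
function hashPassword(plain) {
  const salt = crypto.randomBytes(16);
  const hash = crypto.scryptSync(plain, salt, 64);
  return `${salt.toString('hex')}:${hash.toString('hex')}`;
}

// Stand-in for bcrypt.compare: re-derives the hash and compares in constant time
function verifyPassword(plain, stored) {
  const [saltHex, hashHex] = stored.split(':');
  const expected = Buffer.from(hashHex, 'hex');
  const actual = crypto.scryptSync(plain, Buffer.from(saltHex, 'hex'), expected.length);
  return crypto.timingSafeEqual(actual, expected);
}

const stored = hashPassword('secret123');
console.log(verifyPassword('secret123', stored));   // true
console.log(verifyPassword('wrong-guess', stored)); // false
```

Whichever algorithm is used, the stored value is never decrypted. Verification always re-hashes the candidate and compares the results.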

Post-Save Hook - Logging & Side Effects

Execute logic after a document is saved:

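// isNew is already false inside post('save'), so remember it beforehand
userSchema.pre('save', function() {
  this.$locals.wasNew = this.isNew;
});

userSchema.post('save', async function(doc) {
  console.log(`User ${doc.email} was saved`);
  if (!doc.$locals.wasNew) return; // Skip one-time side effects on updates

  // sendWelcomeEmail and logEvent are application-defined helpers
  await sendWelcomeEmail(doc.email);
  await logEvent('user_created', { userId: doc._id });
});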

userSchema.post('save', function(error, doc, next) {
  // Error handling in post hooks
  if (error.name === 'MongoServerError' && error.code === 11000) {
    next(new Error('Email already exists'));
  } else {
    next(error);
  }
});
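
The duplicate-key branch can be pulled into a small function so that it can be tested without a database. This is a minimal sketch: `describeDuplicateKey` is a hypothetical helper, and it assumes the error carries a `keyValue` object naming the conflicting field, which recent MongoDB servers include on error code 11000.

```javascript
// Returns a user-facing message for MongoDB duplicate-key errors, or null otherwise
function describeDuplicateKey(error) {
  if (!error || error.code !== 11000) {
    return null;
  }
  // keyValue is assumed to look like { email: 'user@example.com' }
  const fields = Object.keys(error.keyValue || {});
  if (fields.length === 0) {
    return 'Duplicate value for a unique field';
  }
  return `${fields.join(', ')} already exists`;
}

const sample = {
  name: 'MongoServerError',
  code: 11000,
  keyValue: { email: 'user@example.com' }
};
console.log(describeDuplicateKey(sample)); // email already exists
```

Inside the error-handling hook, a non-null result could be wrapped in `new Error(...)` and passed to `next`.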

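Pre-Delete Hook - Cleanup Data

Delete related documents when the parent is deleted. Document.prototype.remove() was dropped in Mongoose 7, so this hook targets the document-level deleteOne():

// Runs for `await user.deleteOne()`, not for `User.deleteOne({ ... })`
userSchema.pre('deleteOne', { document: true, query: false }, async function() {
  // Delete all posts by this user
  await mongoose.model('Post').deleteMany({ author: this._id });
  // Remove from group memberships
  await mongoose.model('Group').updateMany(
    { members: this._id },
    { $pull: { members: this._id } }
  );
});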

Query Middleware - Auto-populate Related Data

Automatically populate references on every find:

// Pre-find hooks - automatically populate posts
// findById() triggers 'findOne' middleware, so it is covered too
userSchema.pre(['find', 'findOne'], function() {
  this.populate('posts');
});

// Usage - posts automatically loaded
const user = await User.findById(userId);
console.log(user.posts); // Already populated!

Validation - Enforce Data Integrity

Built-in Validators

const postSchema = new mongoose.Schema({
  title: {
    type: String,
    required: [true, 'Title is required'],  // Custom error message
    minlength: [5, 'Title must be 5+ chars'],
    maxlength: [100, 'Title max 100 chars'],
    trim: true
  },
  
  content: {
    type: String,
    required: true,
    minlength: 50
  },
  
  rating: {
    type: Number,
    min: [1, 'Rating must be 1-5'],
    max: [5, 'Rating must be 1-5']
  },
  
  tags: {
    type: [String],
    validate: {
      validator: function(tags) {
        return tags.length <= 10; // Max 10 tags
      },
      message: 'Cannot have more than 10 tags'
    }
  },
  
  publishedDate: {
    type: Date,
    validate: {
      validator: function(date) {
        return date <= new Date(); // Can't publish in future
      },
      message: 'Published date cannot be in the future'
    }
  }
});
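
A `validate.validator` is an ordinary function that receives the field value and returns a boolean, so the two custom checks above can be extracted and exercised without Mongoose. This is a sketch only; `hasAtMostTenTags` and `isNotInFuture` are illustrative names.

```javascript
// Each validator receives the field value and returns true when it is acceptable
function hasAtMostTenTags(tags) {
  return Array.isArray(tags) && tags.length <= 10;
}

function isNotInFuture(date) {
  return date instanceof Date && date.getTime() <= Date.now();
}

// In a schema these would be wired up as, for example:
//   tags: { type: [String], validate: { validator: hasAtMostTenTags, message: '...' } }

const tomorrow = new Date(Date.now() + 24 * 60 * 60 * 1000);
console.log(hasAtMostTenTags(['a', 'b', 'c'])); // true
console.log(isNotInFuture(tomorrow));           // false
```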

Custom Validators

Create complex validation logic:

// Validate email format
userSchema.path('email').validate(function(email) {
  return /^[^\s@]+@[^\s@]+\.[^\s@]+$/.test(email);
}, 'Invalid email format');

// Validate password strength
userSchema.path('password').validate(function(password) {
  // At least 8 chars, 1 uppercase, 1 number, 1 special char
  return /^(?=.*[A-Z])(?=.*\d)(?=.*[@$!%*?&])[A-Za-z\d@$!%*?&]{8,}$/.test(password);
}, 'Password must be 8+ chars with uppercase, number, and special char');

// Validate related data (assumes the schema also defines a parentEmail field)
userSchema.path('profile.age').validate(async function(age) {
  if (age < 18) {
    const parent = await User.findOne({ 
      _id: { $ne: this._id }, 
      email: this.parentEmail 
    });
    return !!parent; // Must have valid parent
  }
  return true;
}, 'Minors must have parent account');
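
It helps to run a password rule against a few samples before relying on it. The sketch below assumes the rule described above (8+ characters, one uppercase letter, one digit, one special character from `@$!%*?&`). `isStrongPassword` is an illustrative name.

```javascript
// Lookaheads require an uppercase letter, a digit, and a special character;
// the final character class also restricts which characters are allowed at all
const strongPassword = /^(?=.*[A-Z])(?=.*\d)(?=.*[@$!%*?&])[A-Za-z\d@$!%*?&]{8,}$/;

function isStrongPassword(value) {
  return typeof value === 'string' && strongPassword.test(value);
}

console.log(isStrongPassword('Passw0rd!')); // true
console.log(isStrongPassword('password'));  // false (no uppercase, digit, or special char)
console.log(isStrongPassword('Passw0rd#')); // false ('#' is outside the allowed set)
console.log(isStrongPassword('P4ss!'));     // false (shorter than 8 characters)
```

The third case is easy to miss: any character outside the listed set makes the whole password fail, so the error message should say which special characters are accepted.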

Virtual Properties - Computed Fields

Virtuals are properties that exist in memory but not in the database. Perfect for computed fields:

// Full name virtual
userSchema.virtual('fullName').get(function() {
  return `${this.profile.firstName} ${this.profile.lastName}`;
}).set(function(fullName) {
  const [first, last] = fullName.split(' ');
  this.profile.firstName = first;
  this.profile.lastName = last;
});

// Usage
const user = await User.findById(userId);
console.log(user.fullName); // Returns computed full name
user.fullName = 'John Doe'; // Sets first and last name

// Include virtuals in JSON output (pass these options where the schema is created)
const userSchema = new mongoose.Schema({
  // ... fields
}, { 
  toJSON: { virtuals: true },
  toObject: { virtuals: true }
});

// Age virtual - computed from birthDate (assumes the schema defines a birthDate Date field)
userSchema.virtual('age').get(function() {
  const today = new Date();
  let age = today.getFullYear() - this.birthDate.getFullYear();
  const monthDiff = today.getMonth() - this.birthDate.getMonth();
  if (monthDiff < 0 || (monthDiff === 0 && today.getDate() < this.birthDate.getDate())) {
    age--;
  }
  return age;
});
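
The age calculation is easier to verify as a standalone function that takes the reference date as a parameter. This is a sketch of the same logic; `calculateAge` is an illustrative name, and dates are interpreted in local time.

```javascript
// Same logic as the virtual above, with "today" injectable for testing
function calculateAge(birthDate, today = new Date()) {
  let age = today.getFullYear() - birthDate.getFullYear();
  const monthDiff = today.getMonth() - birthDate.getMonth();
  // Subtract a year if this year's birthday has not happened yet
  if (monthDiff < 0 || (monthDiff === 0 && today.getDate() < birthDate.getDate())) {
    age--;
  }
  return age;
}

const born = new Date(2000, 5, 15); // 15 June 2000 (months are zero-based)
console.log(calculateAge(born, new Date(2024, 5, 14))); // 23
console.log(calculateAge(born, new Date(2024, 5, 15))); // 24
```

The virtual's getter could then simply return `calculateAge(this.birthDate)`.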

Querying & Relationships

Populate - Loading Related Documents

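Populate replaces stored ObjectIds with the referenced documents. The result resembles a SQL JOIN, but Mongoose fetches each populated path with a separate query: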

// Define reference in schema
const postSchema = new mongoose.Schema({
  title: String,
  author: {
    type: mongoose.Schema.Types.ObjectId,
    ref: 'User',  // Reference to User model
    required: true
  }
});

const Post = mongoose.model('Post', postSchema);

// Populate author details
const posts = await Post.find().populate('author');
// posts[0].author now contains full User document, not just ID

// Nested populate
const posts = await Post.find()
  .populate({
    path: 'author',
    select: 'email profile', // Only fetch these fields (virtuals such as fullName cannot be selected)
    populate: { path: 'company' } // Populate author's company too
  });

// Multiple populates
const user = await User.findById(userId)
  .populate('posts')
  .populate('groups')
  .populate('following');

Query Operators & Filtering

// Exact match
await User.find({ role: 'admin' });

// Comparison operators
await Post.find({ views: { $gt: 100 } }); // greater than
await Post.find({ rating: { $gte: 4 } }); // greater than or equal
await Post.find({ createdAt: { $lt: new Date('2024-01-01') } }); // less than

// IN operator
await Post.find({ status: { $in: ['published', 'scheduled'] } });

// Regular expressions
await User.find({ email: { $regex: /gmail/, $options: 'i' } }); // Case-insensitive

// Logical operators
await Post.find({
  $and: [
    { published: true },
    { createdAt: { $gte: new Date('2024-01-01') } },
    { $or: [{ author: userId1 }, { author: userId2 }] }
  ]
});

// Text search (requires text index)
userSchema.index({ email: 'text', name: 'text' });
await User.find({ $text: { $search: 'john' } });

Aggregation Pipeline - Complex Queries

Use aggregation for complex data transformation:

// Group posts by author and count
const results = await Post.aggregate([
  { $match: { published: true } },
  { $group: { 
      _id: '$author', 
      count: { $sum: 1 },
      avgRating: { $avg: '$rating' }
    } 
  },
  { $sort: { count: -1 } },
  { $limit: 10 }
]);

// Top 5 trending tags
const trends = await Post.aggregate([
  { $unwind: '$tags' },
  { $group: { _id: '$tags', count: { $sum: 1 } } },
  { $sort: { count: -1 } },
  { $limit: 5 }
]);

// User stats
const stats = await User.aggregate([
  { $lookup: {
      from: 'posts',
      localField: '_id',
      foreignField: 'author',
      as: 'userPosts'
    }
  },
  { $project: {
      email: 1,
      postCount: { $size: '$userPosts' },
      avgPostRating: { $avg: '$userPosts.rating' }
    }
  }
]);
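
To make the stages concrete, here is a plain JavaScript equivalent of the trending-tags pipeline, run on an in-memory array. It is for illustration only: MongoDB executes the real pipeline on the server, and `topTags` is a hypothetical name.

```javascript
// In-memory equivalent of: $unwind tags -> $group by tag -> $sort desc -> $limit
function topTags(posts, limit) {
  const counts = new Map();
  for (const post of posts) {
    for (const tag of post.tags || []) {            // $unwind: one entry per tag
      counts.set(tag, (counts.get(tag) || 0) + 1);  // $group with count: { $sum: 1 }
    }
  }
  return [...counts.entries()]
    .map(([tag, count]) => ({ _id: tag, count }))
    .sort((a, b) => b.count - a.count)              // $sort: { count: -1 }
    .slice(0, limit);                               // $limit
}

const samplePosts = [
  { tags: ['node', 'mongodb'] },
  { tags: ['node', 'mongoose', 'mongodb'] },
  { tags: ['node'] }
];
console.log(topTags(samplePosts, 2));
// [ { _id: 'node', count: 3 }, { _id: 'mongodb', count: 2 } ]
```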

Indexing - Performance Optimization

Indexes speed up queries significantly:

// Single field index
userSchema.index({ role: 1 }); // Ascending
userSchema.index({ createdAt: -1 }); // Descending

// Compound index (for multi-field queries)
postSchema.index({ author: 1, published: 1 });

// Unique index (same effect as unique: true on the field; declare it in one place only)
userSchema.index({ email: 1 }, { unique: true });

// TTL index (auto-delete after expiration)
sessionSchema.index({ createdAt: 1 }, { expireAfterSeconds: 3600 }); // Delete after 1 hour

// Text index (for full-text search)
userSchema.index({ email: 'text', name: 'text' });

// Sparse index (only indexes documents with the field)
userSchema.index({ phoneNumber: 1 }, { sparse: true });

// List the indexes that exist in MongoDB
await User.listIndexes();

Best Practices for Production

  • Always validate: Use schema validation to enforce data integrity in the application layer, before writes reach MongoDB.
  • Hash sensitive data: Use pre-save hooks to hash passwords and other sensitive data.
  • Use indexes wisely: Index frequently queried fields, but not every field (adds overhead).
  • Lean queries: Use .lean() for read-only queries to reduce memory overhead.
  • Batch operations: Use bulkWrite() for inserting/updating many documents.
  • Error handling: Handle validation errors gracefully in API responses.
  • Connection pooling: Configure connection pool size in production.
  • Monitoring: Monitor query performance with slow query logs.
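
For the error-handling point, one common approach is to flatten a Mongoose `ValidationError` into a field-to-message map before sending it to the client. This is a minimal sketch: `formatValidationError` and the `{ status, fields }` response shape are assumptions for illustration. It relies only on the error's `name` and its `errors` object, whose entries each carry a `message`.

```javascript
// Convert a Mongoose ValidationError into a compact API payload, or null for other errors
function formatValidationError(err) {
  if (!err || err.name !== 'ValidationError' || !err.errors) {
    return null;
  }
  const fields = {};
  for (const [path, detail] of Object.entries(err.errors)) {
    fields[path] = detail.message;
  }
  return { status: 400, fields };
}

// Hand-built object shaped like a ValidationError, for demonstration only
const fakeError = {
  name: 'ValidationError',
  errors: {
    title: { message: 'Title is required' },
    rating: { message: 'Rating must be 1-5' }
  }
};
console.log(formatValidationError(fakeError));
```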

Common Patterns & Gotchas

Avoid N+1 Queries

// BAD - N+1 queries
const users = await User.find();
for (const user of users) {
  user.posts = await Post.find({ author: user._id }); // Extra query per user!
}

// GOOD - populate batches the lookup: one query for users, one for all their posts
const users = await User.find().populate('posts');
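
When the reference lives only on the post (its `author` field), another way to avoid the per-user query is to fetch all matching posts once with `$in` and group them in memory. This is a sketch under that assumption; `groupPostsByAuthor` is a hypothetical helper.

```javascript
// Group an array of posts by author id (ids are stringified so they work as Map keys)
function groupPostsByAuthor(posts) {
  const byAuthor = new Map();
  for (const post of posts) {
    const key = String(post.author);
    if (!byAuthor.has(key)) {
      byAuthor.set(key, []);
    }
    byAuthor.get(key).push(post);
  }
  return byAuthor;
}

// With Mongoose the posts would come from a single query, for example:
//   const posts = await Post.find({ author: { $in: users.map(u => u._id) } }).lean();
const fetched = [
  { title: 'First', author: 'u1' },
  { title: 'Second', author: 'u2' },
  { title: 'Third', author: 'u1' }
];
const grouped = groupPostsByAuthor(fetched);
console.log(grouped.get('u1').length); // 2
```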

Lean Queries for Performance

// Returns plain JavaScript objects (faster, less memory)
const posts = await Post.find().lean();

// Useful for read-only operations
const recentPosts = await Post.find()
  .lean()
  .select('title excerpt createdAt')
  .sort({ createdAt: -1 })
  .limit(10);

Bulk Operations

// Insert many documents efficiently
const docs = [
  { email: 'user1@example.com' },
  { email: 'user2@example.com' },
  { email: 'user3@example.com' }
];
await User.insertMany(docs, { ordered: false }); // Continue on error

// Bulk updates
const bulk = User.collection.initializeUnorderedBulkOp();
bulk.find({ status: 'inactive' }).update({ $set: { active: false } });
bulk.find({ role: 'admin' }).update({ $set: { permissions: [/* ... */] } });
await bulk.execute();
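
`Model.bulkWrite()` takes an array of operation descriptors, which makes it easy to build the batch from application data. The sketch below only constructs that array; `buildRoleUpdates` and the sample input are hypothetical.

```javascript
// Turn a list of { email, role } changes into bulkWrite updateOne operations
function buildRoleUpdates(changes) {
  return changes.map(({ email, role }) => ({
    updateOne: {
      filter: { email },
      update: { $set: { role } }
    }
  }));
}

const ops = buildRoleUpdates([
  { email: 'user1@example.com', role: 'moderator' },
  { email: 'user2@example.com', role: 'admin' }
]);
console.log(JSON.stringify(ops, null, 2));

// With a connected model, the batch would be sent in one round trip:
//   await User.bulkWrite(ops, { ordered: false });
```

As with `insertMany`, `ordered: false` lets the remaining operations proceed if one of them fails.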

Conclusion

Mongoose transforms MongoDB from a schemaless database into a structured, validated data store with powerful middleware capabilities. By mastering schemas, validation, hooks, and relationships, you'll write more robust, maintainable Node.js applications. The combination of flexibility and structure makes Mongoose ideal for professional applications that need data integrity without sacrificing MongoDB's scalability.
