Building a custom Enterprise Resource Planning (ERP) system is no small feat. Here's what we learned designing and implementing a scalable, modular ERP platform from scratch.
The Problem We Solved
Most companies start with off-the-shelf ERP solutions like SAP or Oracle. But what happens when your business needs are too specific? What if you need:
- Custom workflows unique to your industry
- Integration with legacy systems that vendors won't support
- Real-time data across distributed teams
- The ability to change features without vendor lock-in
We faced this exact challenge. So we built our own. Here's how.
Our Tech Stack
Backend: Node.js + NestJS
We chose Node.js with NestJS as our foundation:
Why?
- TypeScript for enterprise-grade type safety
- Easy to scale horizontally
- Rich ecosystem for business logic
- Fast development iteration
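// Example: Core business module structure
import { Module } from '@nestjs/common';
// DatabaseModule, AuthModule, and the controllers/services come from the project's own files
@Module({
  imports: [DatabaseModule, AuthModule],
  controllers: [InventoryController, OrderController],
  providers: [InventoryService, OrderService],
})
export class BusinessModule {}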
Pain point: Event-driven updates across modules became complex. Solution: We implemented a message queue pattern with Bull/Redis.
Database: PostgreSQL + Redis
PostgreSQL for transactional data:
- ACID compliance (critical for financial records)
- Complex joins for reports
- Powerful JSON support for semi-structured data
Redis for caching & real-time features:
- Session management
- Real-time inventory updates
- Rate limiting & queue management
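-- Example: Inventory table design
CREATE TABLE inventory (
    id UUID PRIMARY KEY,
    sku VARCHAR(50) UNIQUE NOT NULL,
    quantity_on_hand INT NOT NULL,
    quantity_reserved INT DEFAULT 0,
    last_updated TIMESTAMP,
    warehouse_id UUID REFERENCES warehouses(id)
);
-- PostgreSQL does not index foreign-key columns automatically
CREATE INDEX idx_inventory_warehouse_id ON inventory (warehouse_id);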
Key lesson: Don't underestimate the importance of indexing. A missing index on warehouse_id caused our inventory queries to take 30+ seconds on 1M+ records.
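The two quantity columns in the table above imply a simple rule: stock that can still be promised is quantity_on_hand minus quantity_reserved. Here is a minimal TypeScript sketch of that rule. The `InventoryRow` type and the `reserve` function are hypothetical names for illustration, and a real service would run this check inside a database transaction rather than in memory.

```typescript
// Mirrors the quantity columns of the inventory table (hypothetical type name).
interface InventoryRow {
  sku: string;
  quantityOnHand: number;
  quantityReserved: number;
}

// Units that can still be promised to new orders.
function availableQuantity(row: InventoryRow): number {
  return row.quantityOnHand - row.quantityReserved;
}

// Returns an updated copy of the row, or throws if there is not enough stock.
function reserve(row: InventoryRow, quantity: number): InventoryRow {
  if (!Number.isInteger(quantity) || quantity <= 0) {
    throw new Error(`Invalid quantity: ${quantity}`);
  }
  if (quantity > availableQuantity(row)) {
    throw new Error(`Insufficient stock for ${row.sku}`);
  }
  return { ...row, quantityReserved: row.quantityReserved + quantity };
}

const row: InventoryRow = { sku: "SKU-1", quantityOnHand: 10, quantityReserved: 4 };
const updated = reserve(row, 5);
console.log(availableQuantity(updated)); // 1
```

Returning a copy instead of mutating the row keeps the check easy to test; the same comparison would typically live in the WHERE clause of the UPDATE in production.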
Frontend: React + TypeScript
Built with:
- React Query for server state management
- Tailwind CSS for consistent UI
- Zustand for client state
- Vite for fast bundling
The dashboard handles real-time updates via WebSockets—critical for monitoring inventory, orders, and financials simultaneously.
Infrastructure: Docker + Kubernetes
Containerized everything:
- Services run in Docker containers
- Orchestrated with Kubernetes
- Auto-scaling based on CPU/memory
- Separate environments: dev → staging → production
# Example: Kubernetes deployment
apiVersion: apps/v1
kind: Deployment
metadata:
  name: erp-api
spec:
  replicas: 3
  selector:
    matchLabels:
      app: erp-api
  template:
    metadata:
      labels:
        app: erp-api
    spec:
      containers:
        - name: erp-api
          image: vexio/erp-api:latest
          resources:
            limits:
              memory: "512Mi"
              cpu: "500m"
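For the auto-scaling bullet above, it helps to see the arithmetic. Kubernetes' Horizontal Pod Autoscaler documents its core rule as desiredReplicas = ceil(currentReplicas × currentMetric / targetMetric), and it skips scaling when that ratio is within a tolerance of 1 (0.1 by default). A small TypeScript sketch of that rule follows; the function name and the min/max bounds in the example are illustrative assumptions, not values from our cluster.

```typescript
// Sketch of the Horizontal Pod Autoscaler's documented scaling rule.
// desiredReplicas is a hypothetical helper name; it clamps the result to [min, max].
function desiredReplicas(
  currentReplicas: number,
  currentUtilization: number, // e.g. average CPU utilization in percent
  targetUtilization: number, // e.g. 60 for a 60% CPU target
  minReplicas: number,
  maxReplicas: number,
  tolerance = 0.1,
): number {
  const ratio = currentUtilization / targetUtilization;
  // Within tolerance of the target: leave the replica count alone.
  if (Math.abs(ratio - 1) <= tolerance) {
    return currentReplicas;
  }
  const desired = Math.ceil(currentReplicas * ratio);
  return Math.min(maxReplicas, Math.max(minReplicas, desired));
}

// 3 replicas at 90% CPU with a 60% target: ceil(3 * 1.5) = 5
console.log(desiredReplicas(3, 90, 60, 3, 10)); // 5
// 3 replicas at 62% CPU with a 60% target: within tolerance, stays at 3
console.log(desiredReplicas(3, 62, 60, 3, 10)); // 3
```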
Architecture Decisions
1. Microservices vs Monolith
We started monolithic, then split.
Initially: Everything in one Node.js app—users, inventory, orders, accounting, reporting. Deployment was single-click. But scaling was painful. If inventory got hammered, the entire system degraded.
What changed: We broke it into domain-driven microservices:
- User Service (auth, permissions)
- Inventory Service (stock, warehouses)
- Order Service (sales orders, fulfillment)
- Accounting Service (GL, AP, AR)
- Reporting Service (BI, analytics)
Each service:
- Has its own database (no shared tables)
- Communicates via APIs & message queues
- Scales independently
Trade-off: More complex to manage, but operational efficiency improved 40%.
2. Event-Driven Communication
Order placed → triggers inventory deduction → triggers accounting entry.
We use Apache Kafka (or RabbitMQ as an alternative):
// Example: Order Service publishes event
async createOrder(orderData) {
  const order = await this.db.orders.create(orderData);
  // Publish event for other services
  await this.messageQueue.publish('order.created', {
    orderId: order.id,
    items: order.items,
    customerId: order.customerId
  });
  return order;
}
// Inventory Service subscribes
messageQueue.subscribe('order.created', async (event) => {
  await this.inventoryService.reserveItems(event.items);
});
Benefit: Services are decoupled. A slow accounting service doesn't block order creation.
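To make the decoupling concrete, here is a self-contained TypeScript sketch that uses an in-memory queue as a stand-in for the broker. `InMemoryBus` and its `drain` method are hypothetical names for illustration, not Kafka or RabbitMQ APIs. The point it shows is that publishing only enqueues a message, so a failing subscriber cannot block the publisher or the other subscribers.

```typescript
type Handler<T> = (payload: T) => void;

interface Failure {
  topic: string;
  error: unknown;
}

// In-memory stand-in for a broker such as Kafka or RabbitMQ (illustration only).
class InMemoryBus {
  private handlers = new Map<string, Handler<unknown>[]>();
  private pending: { topic: string; payload: unknown }[] = [];

  subscribe<T>(topic: string, handler: Handler<T>): void {
    const list = this.handlers.get(topic) ?? [];
    list.push(handler as Handler<unknown>);
    this.handlers.set(topic, list);
  }

  // Publishing only enqueues; it never runs subscriber code.
  publish<T>(topic: string, payload: T): void {
    this.pending.push({ topic, payload });
  }

  // Delivers queued messages; one failing subscriber does not stop the others.
  drain(): Failure[] {
    const failures: Failure[] = [];
    while (this.pending.length > 0) {
      const message = this.pending.shift()!;
      for (const handler of this.handlers.get(message.topic) ?? []) {
        try {
          handler(message.payload);
        } catch (error) {
          failures.push({ topic: message.topic, error });
        }
      }
    }
    return failures;
  }
}

const bus = new InMemoryBus();
const reserved: string[] = [];
bus.subscribe<{ orderId: string }>("order.created", (e) => reserved.push(e.orderId));
bus.subscribe<{ orderId: string }>("order.created", () => {
  throw new Error("accounting is down");
});

bus.publish("order.created", { orderId: "o-1" }); // returns without running handlers
const failures = bus.drain();
console.log(reserved.length, failures.length); // one reservation, one recorded failure
```

A real broker adds persistence, retries, and ordering guarantees on top of this; the sketch only covers the isolation between publisher and subscribers.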
3. Real-Time Updates with WebSockets
Managers need live dashboards. We use Socket.io for pushing updates:
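// Server: When inventory changes
import { WebSocketGateway, WebSocketServer } from '@nestjs/websockets';
import { Server } from 'socket.io';
// InventoryService is imported from the project's own inventory module
// @WebSocketGateway() is what makes Nest attach the Socket.io server below
@WebSocketGateway()
export class InventoryGateway {
  @WebSocketServer() server: Server;
  constructor(private readonly inventoryService: InventoryService) {}
  async updateInventory(sku: string, quantity: number) {
    await this.inventoryService.update(sku, quantity);
    // Push to all connected clients
    this.server.emit('inventory.updated', {
      sku,
      newQuantity: quantity,
      timestamp: new Date()
    });
  }
}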
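On the dashboard side, pushed events can arrive late or out of order, for example after a reconnect. A minimal TypeScript sketch of how a client could fold `inventory.updated` payloads into its state while ignoring stale ones is shown below. The payload fields come from the gateway above; the `StockState` shape and the `applyInventoryUpdate` function are hypothetical, and the timestamp is assumed to arrive as an ISO-8601 string after serialization.

```typescript
// Payload shape taken from the gateway above; the timestamp is assumed to
// arrive as an ISO-8601 string after serialization.
interface InventoryUpdated {
  sku: string;
  newQuantity: number;
  timestamp: string;
}

interface StockEntry {
  quantity: number;
  updatedAt: number; // epoch milliseconds
}

type StockState = Record<string, StockEntry>;

// Returns the next dashboard state, ignoring events older than what we already hold.
function applyInventoryUpdate(state: StockState, update: InventoryUpdated): StockState {
  const updatedAt = Date.parse(update.timestamp);
  const current = state[update.sku];
  if (Number.isNaN(updatedAt) || (current && current.updatedAt >= updatedAt)) {
    return state; // malformed or stale event
  }
  return { ...state, [update.sku]: { quantity: update.newQuantity, updatedAt } };
}

let stock: StockState = {};
stock = applyInventoryUpdate(stock, { sku: "SKU-1", newQuantity: 8, timestamp: "2024-01-01T10:00:05Z" });
// An older event delivered late is ignored.
stock = applyInventoryUpdate(stock, { sku: "SKU-1", newQuantity: 9, timestamp: "2024-01-01T10:00:01Z" });
console.log(stock["SKU-1"].quantity); // 8
```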
Biggest Challenges We Faced
1. Data Consistency Across Services
The problem: Order Service reserves inventory, but Inventory Service is temporarily down. Orders process, but inventory never gets updated.
Solution: Saga pattern for distributed transactions.
async function processOrderSaga(order) {
  // Declared outside try so the catch block can see it
  let reservation = null;
  try {
    // Step 1: Reserve inventory
    reservation = await inventoryService.reserve(order.items);
    // Step 2: Create accounting entries
    await accountingService.recordSale(order.amount);
    // Step 3: Update order status
    await orderService.markProcessed(order.id);
  } catch (error) {
    // Compensate: release the reservation, but only if step 1 succeeded
    if (reservation) {
      await inventoryService.releaseReservation(reservation);
    }
    // Note: if step 3 fails, the sale recorded in step 2 also needs a reversing entry
    throw error;
  }
}
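The same idea generalizes: pair every step with a compensating action, and on failure undo the completed steps in reverse order. The sketch below is one way to express that in TypeScript, under the assumption that every step can be undone. `SagaStep`, `runSaga`, and the stubbed steps are hypothetical names for illustration, not our production code.

```typescript
// One saga step: an action plus the compensation that undoes it.
interface SagaStep {
  label: string;
  run: () => Promise<void>;
  compensate: () => Promise<void>;
}

// Runs steps in order; on failure, compensates completed steps in reverse order.
async function runSaga(steps: SagaStep[]): Promise<void> {
  const completed: SagaStep[] = [];
  for (const step of steps) {
    try {
      await step.run();
      completed.push(step);
    } catch (error) {
      for (const done of completed.reverse()) {
        await done.compensate();
      }
      throw error;
    }
  }
}

// Usage with stubbed services (all names below are placeholders).
const sagaLog: string[] = [];
const orderSteps: SagaStep[] = [
  {
    label: "reserve-inventory",
    run: async () => { sagaLog.push("reserved"); },
    compensate: async () => { sagaLog.push("released"); },
  },
  {
    label: "record-sale",
    run: async () => { sagaLog.push("sale recorded"); },
    compensate: async () => { sagaLog.push("sale reversed"); },
  },
  {
    label: "mark-processed",
    run: async () => { throw new Error("order service unavailable"); },
    compensate: async () => {},
  },
];

const sagaResult = runSaga(orderSteps).catch((error: Error) => {
  console.log(error.message, sagaLog);
});
```

A production version would also need to handle a compensation that itself fails, typically by retrying it or recording it for manual follow-up.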
2. Reporting Performance
Real-time reports across millions of transactions = slow queries.
We implemented:
- Read replicas of the database for reporting
- Data warehouse (Snowflake/BigQuery) for historical analytics
- Materialized views for common reports
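-- Materialized view: Monthly sales by region
CREATE MATERIALIZED VIEW sales_by_region_monthly AS
SELECT
    date_trunc('month', order_date) AS month,
    region,
    SUM(amount) AS total_sales,
    COUNT(*) AS order_count
FROM orders
GROUP BY date_trunc('month', order_date), region;
-- REFRESH ... CONCURRENTLY requires a unique index on the view
CREATE UNIQUE INDEX ON sales_by_region_monthly (month, region);
-- Refresh every hour (run from a scheduler such as cron; the statement does not repeat on its own)
REFRESH MATERIALIZED VIEW CONCURRENTLY sales_by_region_monthly;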
3. Permission & Access Control
Complex organizational hierarchies = complex permissions. You can't just hardcode "user can view orders."
Solution: Role-Based Access Control (RBAC) with attribute-based rules:
// Define permissions
{
  role: "regional_manager",
  permissions: [
    { resource: "orders", actions: ["read", "update"], region: "ASIA" },
    { resource: "inventory", actions: ["read"], warehouse: "ANY" }
  ]
}
// Check before returning data
async getOrders(userId, filter) {
  const user = await this.getUser(userId);
  const allowedRegions = user.permissions
    .filter(p => p.resource === 'orders' && p.actions.includes('read'))
    .map(p => p.region);
  // Apply the caller's filter too, but never let it widen the allowed regions
  return this.db.orders.find({ ...filter, region: { $in: allowedRegions } });
}
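The permission objects above can also drive a single yes/no check per request. Here is a minimal TypeScript sketch, assuming that a missing attribute on a permission means "no restriction" and that "ANY" is a wildcard. The `isAllowed` and `attributeMatches` functions and the `AccessRequest` type are hypothetical names for illustration.

```typescript
type Action = "read" | "update";

// Mirrors the permission objects shown above; "ANY" acts as a wildcard.
interface Permission {
  resource: string;
  actions: Action[];
  region?: string;
  warehouse?: string;
}

interface AccessRequest {
  resource: string;
  action: Action;
  region?: string;
  warehouse?: string;
}

// An attribute rule passes if it is absent, is the wildcard, or equals the requested value.
function attributeMatches(rule: string | undefined, value: string | undefined): boolean {
  return rule === undefined || rule === "ANY" || rule === value;
}

function isAllowed(permissions: Permission[], request: AccessRequest): boolean {
  return permissions.some(
    (p) =>
      p.resource === request.resource &&
      p.actions.includes(request.action) &&
      attributeMatches(p.region, request.region) &&
      attributeMatches(p.warehouse, request.warehouse),
  );
}

const regionalManager: Permission[] = [
  { resource: "orders", actions: ["read", "update"], region: "ASIA" },
  { resource: "inventory", actions: ["read"], warehouse: "ANY" },
];

console.log(isAllowed(regionalManager, { resource: "orders", action: "update", region: "ASIA" })); // true
console.log(isAllowed(regionalManager, { resource: "orders", action: "read", region: "EU" })); // false
console.log(isAllowed(regionalManager, { resource: "inventory", action: "update", warehouse: "W1" })); // false
```

Treating an absent attribute as unrestricted keeps the role definitions short, but it is a design choice; a stricter system might deny access unless every attribute is listed explicitly.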
Key Lessons Learned
1. Start Simple, Evolve Gradually
Don't architect for Netflix-scale on day one. We spent months on infrastructure we didn't need. MVP first, optimize later.
2. Database Design is Everything
A single missing index or poorly designed schema cascades into system-wide performance issues. Invest time upfront.
3. Logging & Monitoring Are Non-Negotiable
When something breaks in production (and it will), you need to know why. We use:
- ELK Stack (Elasticsearch, Logstash, Kibana) for centralized logging
- Prometheus + Grafana for metrics
- Sentry for error tracking
4. Documentation Saves Time
When you have 5+ services, teams need to understand:
- API contracts (OpenAPI/Swagger)
- Database schemas
- Event formats
- Deployment procedures
We use AsyncAPI for documenting message-driven events.
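Documented event formats pay off most when consumers also check them at runtime, because a message coming off a queue carries no compile-time guarantees. Below is a small TypeScript sketch of a shared type plus a type guard for the order.created event. The `orderId`, `customerId`, and `items` fields come from the publisher example earlier; the item fields (`sku`, `quantity`) and all function names are assumptions for illustration.

```typescript
// Payload of the order.created event. The item fields (sku, quantity) are an
// assumed shape for illustration; the publisher example only names `items`.
interface OrderItem {
  sku: string;
  quantity: number;
}

interface OrderCreatedEvent {
  orderId: string;
  customerId: string;
  items: OrderItem[];
}

function isRecord(value: unknown): value is Record<string, unknown> {
  return typeof value === "object" && value !== null;
}

function isOrderItem(value: unknown): value is OrderItem {
  return isRecord(value) && typeof value.sku === "string" && typeof value.quantity === "number";
}

// Runtime check for messages coming off the queue, where types are not enforced.
function isOrderCreatedEvent(value: unknown): value is OrderCreatedEvent {
  return (
    isRecord(value) &&
    typeof value.orderId === "string" &&
    typeof value.customerId === "string" &&
    Array.isArray(value.items) &&
    value.items.every(isOrderItem)
  );
}

const incoming: unknown = JSON.parse(
  '{"orderId":"o-1","customerId":"c-9","items":[{"sku":"SKU-1","quantity":2}]}',
);
console.log(isOrderCreatedEvent(incoming)); // true
// A key with the wrong casing (customerID) is rejected instead of silently ignored.
console.log(isOrderCreatedEvent({ orderId: "o-2", customerID: "c-9", items: [] })); // false
```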
5. Testing is Critical
Unit tests catch logic bugs. But integration tests prevent system-wide failures:
// Integration test: Verify order → inventory → accounting flow
it('should create order and update all services', async () => {
  const order = await orderService.create(testOrder);
  // Verify inventory was reserved
  const inventory = await inventoryService.getReservation(order.id);
  expect(inventory.quantity).toBe(-10);
  // Verify accounting recorded the transaction
  const ledger = await accountingService.getEntries(order.id);
  expect(ledger).toHaveLength(2); // Debit + Credit
});
What We'd Do Differently
- Invest in API design early - Changing contracts across services is painful
- Build observability from day one - Don't add monitoring after things break
- Use events for everything - Even internal service calls; makes debugging easier
- Separate read & write models - CQRS pattern simplifies complex domains
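To illustrate the last point, here is a minimal TypeScript sketch of separate write and read models for stock levels. The write side validates commands and records events; the read side builds a query-friendly view from those events, which is one common way to keep the two in sync. All names (`InventoryWriteModel`, `StockEvent`, `projectStockLevels`) are hypothetical and not taken from our services.

```typescript
// Events emitted by the write side.
type StockEvent =
  | { type: "StockReceived"; sku: string; quantity: number }
  | { type: "StockShipped"; sku: string; quantity: number };

// Write side: commands are validated against current state and turned into events.
class InventoryWriteModel {
  private onHand = new Map<string, number>();
  readonly events: StockEvent[] = [];

  receive(sku: string, quantity: number): void {
    this.apply({ type: "StockReceived", sku, quantity });
  }

  ship(sku: string, quantity: number): void {
    if ((this.onHand.get(sku) ?? 0) < quantity) {
      throw new Error(`Cannot ship ${quantity} of ${sku}`);
    }
    this.apply({ type: "StockShipped", sku, quantity });
  }

  private apply(event: StockEvent): void {
    const delta = event.type === "StockReceived" ? event.quantity : -event.quantity;
    this.onHand.set(event.sku, (this.onHand.get(event.sku) ?? 0) + delta);
    this.events.push(event);
  }
}

// Read side: a denormalized view built only from events, shaped for queries.
function projectStockLevels(events: StockEvent[]): Record<string, number> {
  const levels: Record<string, number> = {};
  for (const event of events) {
    const delta = event.type === "StockReceived" ? event.quantity : -event.quantity;
    levels[event.sku] = (levels[event.sku] ?? 0) + delta;
  }
  return levels;
}

const writeModel = new InventoryWriteModel();
writeModel.receive("SKU-1", 10);
writeModel.ship("SKU-1", 3);
console.log(projectStockLevels(writeModel.events)["SKU-1"]); // 7
```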
The Result
Today, our ERP system handles:
- 50+ integrated modules (Inventory, Accounting, HR, CRM, etc.)
- 10M+ transactions/day across multiple regions
- 99.95% uptime (we're still chasing 99.99%)
- Sub-200ms API response times (p95)
But more importantly: it's maintainable. New features take weeks, not months. Teams can own their own services. We can scale components independently.
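For readers less familiar with the uptime and p95 figures above, the arithmetic behind them can be sketched in a few lines of TypeScript. The function names and the sample latencies are made up for illustration; the percentile uses the nearest-rank definition, which is one of several in common use.

```typescript
const MINUTES_PER_YEAR = 365 * 24 * 60; // 525,600, ignoring leap years

// Downtime allowed per year for a given availability percentage.
function downtimeBudgetMinutes(availabilityPercent: number): number {
  return ((100 - availabilityPercent) / 100) * MINUTES_PER_YEAR;
}

// Nearest-rank percentile: the smallest sample with at least p% of samples at or below it.
function percentile(samples: number[], p: number): number {
  if (samples.length === 0) {
    throw new Error("percentile of an empty sample set is undefined");
  }
  const sorted = [...samples].sort((a, b) => a - b);
  const rank = Math.ceil((p * sorted.length) / 100);
  return sorted[Math.max(rank, 1) - 1];
}

console.log(downtimeBudgetMinutes(99.95).toFixed(1)); // "262.8" minutes, about 4.4 hours
console.log(downtimeBudgetMinutes(99.99).toFixed(1)); // "52.6" minutes

// 20 hypothetical latencies in ms: the p95 is the 19th smallest value.
const latencies = [120, 95, 180, 110, 140, 90, 210, 130, 105, 150,
                   100, 125, 160, 115, 135, 98, 170, 145, 155, 450];
console.log(percentile(latencies, 95)); // 210
```

Moving from 99.95% to 99.99% shrinks the yearly downtime budget from roughly four and a half hours to under an hour, which is why that last step is hard to reach.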
Why This Matters
Building a custom ERP isn't for everyone. But if your business has unique needs—complex workflows, legacy integrations, specific compliance requirements—an off-the-shelf solution will constrain you.
At Vexio, we specialize in helping companies design and build custom ERP systems that actually fit their business. We've learned these lessons so you don't have to. Whether you need a full ERP build, integrations with existing systems, or help modernizing legacy infrastructure, we understand the technical and business challenges.
If you're building or planning an ERP system, check out how Vexio can help with enterprise solutions.
Questions?
Have you built custom enterprise systems? What was your biggest challenge?
Drop a comment below or reach out if you want to discuss ERP architecture, tech choices, or lessons learned.