Database Administration — Thu Apr 02

← Home | ← database-admin

NoSQL, OLAP, Transactional and Flat Files: What They Actually Are

Thu Apr 02
#databases #architecture #storage #olap #nosql #sql #flat-files #data-platforms

When people first get into data, they tend to hear the same words over and over again: transactional, OLAP, NoSQL, flat files.

At first they can sound like product categories, technical buzzwords, or just different names for “where data lives.” But they are not all the same thing, and confusing them can lead to some very bad decisions.

You do not want to run a reporting platform like a live banking system.
You do not want to store everything in CSVs and then wonder why nothing is reliable.
You do not want to store everything in In SQL because the developer prefers writing SQL.
And you definitely do not want to pick a NoSQL database just because it sounds modern.

From a DBA point of view, this is about performance, reliability, storage behaviour, indexing, concurrency, and recovery.
From an Architecture point of view, this is about choosing the right pattern for the right workload.

Transactional systems

A transactional system is built to support the day-to-day running of a business or application.

This is where the live work happens:

  • a customer places an order
  • a payment is recorded
  • an engineer closes a work order
  • a user updates their password
  • a stock level changes

These systems are usually designed around small, fast, highly accurate operations.

They care deeply about things like:

  • atomicity
  • consistency
  • locking
  • rollback
  • concurrency
  • data integrity

In other words, they are built to answer questions like:

  • Has this payment gone through?
  • What is the current address for this customer?
  • Is this order complete?
  • Which turbine is currently in alarm?

This is usually the world of relational databases such as:

  • PostgreSQL
  • MySQL
  • SQL Server
  • Oracle

What transactional systems are good at

Transactional platforms are strong when you need:

  • frequent inserts, updates and deletes
  • reliable single-record lookups
  • strict integrity
  • many users interacting at once
  • confidence that the current state is correct

What transactional systems are not good at

They are usually a poor fit for:

  • giant historical reporting queries
  • huge table scans across years of data
  • complex analytical joins across massive datasets
  • dashboard workloads hitting the same operational database all day

Can they do some of that? Yes.
Should they be your main analytics engine? Usually not.

Easy way to think about it

Transactional = running the business right now


OLAP systems

OLAP stands for Online Analytical Processing.

This is not mainly about running the business in real time.
It is about understanding the business over time.

Instead of asking:

  • Did customer 123 pay their invoice?

OLAP asks:

  • Which customer segments have become less profitable over 3 years?
  • Which turbine models fail more often in winter conditions?
  • What is the average output by site, region and month?
  • How did failures change before and after a maintenance intervention?

OLAP systems are designed for:

  • large scans
  • aggregations
  • trend analysis
  • dashboards
  • reporting
  • historical comparisons

This is where data warehouses and analytical platforms live, such as:

  • Snowflake
  • BigQuery
  • Redshift
  • Databricks SQL
  • Synapse
  • ClickHouse
  • sometimes PostgreSQL or TimescaleDB for smaller analytical estates

What OLAP is good at

OLAP platforms are strong when you need:

  • big queries over lots of rows
  • grouped summaries
  • historical analysis
  • read-heavy workloads
  • business intelligence and reporting

What OLAP is not good at

They are usually not the best choice for:

  • lots of row-by-row live updates
  • highly concurrent operational transactions
  • low-latency application writes
  • systems where every single update must be instantly visible as the source of truth

Easy way to think about it

OLAP = understanding the business


NoSQL systems

NoSQL is a broad category, not one single thing.

It usually means a database that does not rely primarily on the traditional relational table model.

That does not mean “better than SQL.”
It does not mean “faster in all cases.”
And it definitely does not mean “future-proof.”

It means the data model or storage model is different.

There are several major NoSQL patterns.

Document databases

These store data as documents, often JSON-like structures.

Examples:

  • MongoDB
  • Couchbase

Good for:

  • flexible application data
  • records with varying shapes
  • nested structures
  • rapidly changing schemas

Key-value stores

These store data as a key and a value.

Examples:

  • Redis
  • DynamoDB

Good for:

  • caching
  • session state
  • very fast lookups
  • simple access patterns at scale

Column-family stores

Examples:

  • Cassandra
  • HBase

Good for:

  • very large distributed workloads
  • high write throughput
  • systems built around specific access patterns

Graph databases

Examples:

  • Neo4j
  • Amazon Neptune

Good for:

  • relationship-heavy data
  • traversing networks
  • fraud models
  • recommendation engines
  • linked asset or dependency analysis

What NoSQL is good at

NoSQL can be a very good fit when you need:

  • flexible schemas
  • semi-structured data
  • massive scale
  • specialised access patterns
  • fast performance in a narrow problem space

What NoSQL is not good at

NoSQL is often a poor fit when you need:

  • rich relational joins
  • strict ACID transactional behaviour across complex datasets
  • strong consistency in every scenario
  • familiar SQL-style analytics without extra tooling

Easy way to think about it

NoSQL = flexible or specialised storage for cases where traditional relational design is not the best fit


Flat files

Flat files are the simplest of the lot.

This is data stored in files rather than managed by a database engine.

Examples include:

  • CSV
  • TSV
  • JSON files
  • XML
  • text logs
  • Excel files
  • Parquet files
  • Avro files

Some are plain and crude.
Some are highly efficient and modern.
But they are still files.

Flat files are extremely common in real-world data work because they are useful for:

  • imports and exports
  • data exchange between systems
  • archival storage
  • raw ingestion zones
  • batch processing
  • moving data through pipelines

What flat files are good at

Flat files are strong when you need:

  • a simple handoff format
  • cheap storage
  • interoperability
  • batch processing
  • raw landing zones in a lake or object store

What flat files are not good at

Flat files usually do not give you database features such as:

  • row-level transactions
  • locking
  • concurrency control
  • constraints
  • live indexing
  • multi-user update safety

A CSV can be useful, but it is not a database.
A shared spreadsheet can be convenient, but it is not a platform strategy.
And a folder full of exports is not governance.

Easy way to think about it

Flat files = data stored as files, usually for movement, staging, exchange or simple storage


So what is the actual difference?

The simplest version is this:

Transactional

Used to operate the system.

OLAP

Used to analyse the system.

NoSQL

Used when you need flexibility or specialised scale/access patterns.

Flat files

Used to store or move data outside a database engine.

That is the simple answer.

The more important answer is that these are not rivals in a cage fight.
A good data estate will often use all four.


A practical example

Imagine a wind energy company.

Transactional

A live operational database stores:

  • site details
  • turbine metadata
  • maintenance records
  • engineer actions
  • active alarms

This supports business operations and field processes.

OLAP

A warehouse or lakehouse stores:

  • years of SCADA history
  • alarm trends
  • failure rates
  • weather joins
  • asset performance summaries

This supports reporting, data science and strategic analysis.

NoSQL

A document or key-value platform might store:

  • raw JSON device payloads
  • flexible metadata from multiple OEMs
  • cached application state
  • configuration data with variable shape

Flat files

Files in S3, blob storage or on-prem storage might contain:

  • CSV exports from old vendor systems
  • Parquet telemetry extracts
  • JSON logs
  • bulk file drops from third parties

That is a very normal architecture.

Not because one pattern failed.
Because different patterns solve different problems.


The mistake people make

The common mistake is trying to force one storage style to do everything.

Examples:

  • using a transactional database as the main BI engine
  • using flat files as a long-term source of truth without governance
  • choosing NoSQL because it sounds scalable without understanding the query needs
  • trying to run operational applications directly on an OLAP warehouse
  • expecting CSV folders to behave like a managed database

Every platform has trade-offs.

A big part of architecture is not asking, “Which one is best?”
It is asking, “Best for what?”


How I’d choose between them

If I am deciding where data should live, I usually start with the workload.

Choose transactional when:

  • the data changes constantly
  • users or apps are updating it live
  • integrity matters more than large-scale analysis
  • current state is the priority

Choose OLAP when:

  • you need dashboards, reporting or trend analysis
  • the data volume is large
  • the workload is mostly read-heavy
  • you want to join and summarise across history

Choose NoSQL when:

  • the schema is flexible or evolving
  • the access pattern is narrow and specific
  • scale or latency requirements push beyond a traditional relational comfort zone
  • document, graph or key-value structures fit naturally

Choose flat files when:

  • you need exchange formats
  • raw data lands in batches
  • storage cost matters
  • you are building pipelines
  • the file format supports the next stage well

Why this matters for DBAs and Architects

This topic lands in both camps because the two perspectives are connected.

A DBA will ask:

  • How does this behave under load?
  • What are the locking and indexing implications?
  • How do backups and recovery work?
  • What happens when concurrency increases?
  • How will this be maintained?

An Architect will ask:

  • Is this the right store for the workload?
  • How does data move between systems?
  • What is the source of truth?
  • What belongs in the operational layer versus analytical layer?
  • What trade-offs are acceptable?

You need both views.

Because a system can be theoretically elegant and still be awful to operate.
And it can be technically stable while being totally wrong for the business need.


Final thought

These four categories are not really about technology fashion or personal workload preference.
They are about fit.

  • Transactional systems keep the business moving.
  • OLAP systems help the business understand itself.
  • NoSQL systems handle flexible or specialised workloads.
  • Flat files keep data portable, cheap and easy to move.

Once you understand that, the conversation gets much easier.

You stop asking, “Should we use NoSQL or SQL?”
And start asking better questions:

  • What is the workload?
  • What shape is the data?
  • What level of integrity is required?
  • Who needs to read it?
  • Who needs to write it?
  • Is this for operations or analysis?
  • What will hurt six months from now if we get this wrong?

That is where good database administration meets good architecture.

Gareth Winterman