Miller (mlr) CSV Guide

Learn how to use Miller for record-aware CSV wrangling from the command line: filtering, joining, reshaping, and analysis.

What Miller is

Miller, often called mlr, is a command-line tool for working with structured tabular data such as CSV, TSV, and JSON. It is designed to make data wrangling easier in shell-based workflows by treating records and fields as structured data instead of only raw text lines.

That makes it especially useful when CSV files need to be filtered, joined, reshaped, grouped, or summarized from a terminal workflow without moving into a larger programming stack.

Why use Miller with CSV data

  • Work with CSV data from the command line in a structured way
  • Filter and reshape records without hand-rolling parsing logic
  • Join tabular datasets more reliably than line-based shell tools
  • Summarize, sort, and transform records in data pipelines
  • Handle mixed workflow environments involving CSV, TSV, and JSON

How Miller fits into a CSV workflow

CSV files are often processed in shell pipelines during backend automation, reporting, system integration, and operational troubleshooting. Traditional shell tools are powerful, but they usually treat each line as unstructured text, which can make field-based CSV work brittle and hard to maintain.

Miller helps bridge that gap by keeping the convenience of terminal workflows while adding record-aware operations. That makes it useful when you want the speed and composability of command-line data handling without giving up structured tabular logic.

Common use cases

Filtering records

Select only the rows you need from a CSV file using field-aware conditions instead of fragile text slicing.

Joining datasets

Combine related tabular files on shared keys when your workflow needs multi-source data wrangling from the terminal.

Reshaping data

Reformat records and fields as part of ETL-style pipelines, reporting preparation, or downstream automation.

Command-line analysis

Produce quick summaries, counts, and transformed outputs directly from the shell without building a full script or importing data into another tool first.

Miller vs plain shell text tools

Line-oriented shell tools such as grep, sed, awk, and cut are excellent for many tasks, but CSV becomes tricky for them when a quoted field contains a delimiter or a line break, or when an operation needs to refer to columns by header name. Miller parses structured records, which makes many CSV tasks safer and easier to express.

Miller is a strong choice when the terminal is your natural working environment but the data still needs to be treated as structured records rather than as strings and lines.

Frequently asked questions

What is Miller used for?

Miller is used for command-line wrangling of structured formats such as CSV, TSV, and JSON, including filtering, joins, reshaping, and summaries.

Why use Miller with CSV files?

Miller gives terminal workflows record-aware operations, which are more reliable for CSV than plain line-based text manipulation.

How is Miller different from plain text tools?

Miller parses each CSV row into a record of named fields taken from the header line, so transformations and summaries can refer to columns by name instead of by character position or delimiter count.