CSV to Database Import: Complete Guide (MySQL, PostgreSQL, SQL Server) - 2025
Importing CSV files into databases is a fundamental skill for data professionals, developers, and analysts. Whether you're migrating data from legacy systems, importing bulk data for analysis, or setting up data pipelines, knowing how to efficiently import CSV data into different database systems is crucial for successful data management.
This comprehensive guide will teach you how to import CSV files into the three most popular database systems: MySQL, PostgreSQL, and SQL Server. You'll learn the specific commands, best practices, and troubleshooting techniques for each system, along with advanced strategies for handling large datasets and complex data types.
Understanding CSV to Database Import
Before diving into specific database systems, let's understand the general process and common challenges.
Import Process Overview
1. Data Preparation:
- Clean and validate CSV data
- Ensure consistent formatting
- Handle special characters and encoding
- Verify data types and constraints
2. Schema Design:
- Create appropriate table structure
- Choose correct data types
- Set up primary keys and indexes
- Define constraints and relationships
3. Import Execution:
- Use appropriate import method
- Handle errors and exceptions
- Monitor import progress
- Validate data integrity
4. Post-Import Validation (see the sketch after this list):
- Verify row counts
- Check data quality
- Test queries and performance
- Document the process
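A quick way to cover the first two validation steps is a pair of queries: one to compare the imported row count against the source file, one to surface duplicates. A minimal sketch, assuming the customers table used in the examples later in this guide:
-- Row count should equal the CSV line count minus the header row
SELECT COUNT(*) AS imported_rows FROM customers;
-- Any rows returned here are duplicates that need attention
SELECT email, COUNT(*) AS occurrences
FROM customers
GROUP BY email
HAVING COUNT(*) > 1;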
Common Challenges
Data Type Mismatches:
- CSV data may not match expected database types
- Date formats can vary significantly
- Numeric data may have formatting issues
- Text data may contain special characters
Performance Issues:
- Large files can cause timeouts
- Memory constraints with very large datasets
- Index creation can slow down imports
- Network latency for remote databases
Error Handling:
- Invalid data rows causing import failures
- Constraint violations
- Duplicate key errors
- Encoding and character set issues
MySQL CSV Import
Basic LOAD DATA INFILE Command
Simple Import:
LOAD DATA INFILE '/path/to/your/file.csv'
INTO TABLE your_table
FIELDS TERMINATED BY ','
ENCLOSED BY '"'
LINES TERMINATED BY '\n'
IGNORE 1 ROWS;
Advanced Import with Options:
LOAD DATA INFILE '/path/to/your/file.csv'
INTO TABLE your_table
FIELDS TERMINATED BY ','
ENCLOSED BY '"'
ESCAPED BY '\\'
LINES TERMINATED BY '\n'
IGNORE 1 ROWS
(column1, column2, column3, @dummy_column)
SET column4 = NOW();
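Note that LOAD DATA INFILE reads files from the database server's filesystem and is restricted by the secure_file_priv setting. If the CSV sits on your client machine, the LOCAL variant streams it over the connection instead; it must be enabled via local_infile on both server and client. A minimal sketch using the same placeholder table:
LOAD DATA LOCAL INFILE '/path/on/client/file.csv'
INTO TABLE your_table
FIELDS TERMINATED BY ','
ENCLOSED BY '"'
LINES TERMINATED BY '\n'
IGNORE 1 ROWS;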
MySQL Import Examples
Example 1: Basic Customer Data Import
-- Create table
CREATE TABLE customers (
id INT AUTO_INCREMENT PRIMARY KEY,
name VARCHAR(100) NOT NULL,
email VARCHAR(100) UNIQUE,
phone VARCHAR(20),
created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);
-- Import data
LOAD DATA INFILE '/path/to/customers.csv'
INTO TABLE customers
FIELDS TERMINATED BY ','
ENCLOSED BY '"'
LINES TERMINATED BY '\n'
IGNORE 1 ROWS
(name, email, phone);
Example 2: Sales Data with Date Handling
-- Create table
CREATE TABLE sales (
id INT AUTO_INCREMENT PRIMARY KEY,
product_name VARCHAR(100),
quantity INT,
price DECIMAL(10,2),
sale_date DATE,
region VARCHAR(50)
);
-- Import with date conversion
LOAD DATA INFILE '/path/to/sales.csv'
INTO TABLE sales
FIELDS TERMINATED BY ','
ENCLOSED BY '"'
LINES TERMINATED BY '\n'
IGNORE 1 ROWS
(product_name, quantity, price, @sale_date, region)
SET sale_date = STR_TO_DATE(@sale_date, '%Y-%m-%d');
Example 3: Handling NULL Values and Defaults
-- Create table with defaults
CREATE TABLE products (
id INT AUTO_INCREMENT PRIMARY KEY,
name VARCHAR(100) NOT NULL,
description TEXT,
price DECIMAL(10,2) DEFAULT 0.00,
category VARCHAR(50) DEFAULT 'Uncategorized',
active BOOLEAN DEFAULT TRUE
);
-- Import with NULL handling
LOAD DATA INFILE '/path/to/products.csv'
INTO TABLE products
FIELDS TERMINATED BY ','
ENCLOSED BY '"'
LINES TERMINATED BY '\n'
IGNORE 1 ROWS
(name, @description, @price, @category, @active)
SET
description = NULLIF(@description, ''),
price = NULLIF(@price, ''),
category = NULLIF(@category, ''),
active = CASE WHEN @active = '1' THEN TRUE ELSE FALSE END;
MySQL Import Best Practices
1. Performance Optimization:
-- Disable non-unique index updates during import (effective for MyISAM tables;
-- for InnoDB, see the drop-and-recreate sketch after this block)
ALTER TABLE your_table DISABLE KEYS;
-- Import data
LOAD DATA INFILE '/path/to/file.csv'
INTO TABLE your_table
FIELDS TERMINATED BY ','
ENCLOSED BY '"'
LINES TERMINATED BY '\n'
IGNORE 1 ROWS;
-- Re-enable indexes
ALTER TABLE your_table ENABLE KEYS;
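DISABLE KEYS only defers non-unique index maintenance on MyISAM tables; InnoDB ignores it. For large InnoDB loads, a common alternative is to drop secondary indexes first and rebuild them afterwards. A sketch, assuming a hypothetical secondary index idx_email on an email column:
-- Drop the secondary index before the load
ALTER TABLE your_table DROP INDEX idx_email;
-- Import data
LOAD DATA INFILE '/path/to/file.csv'
INTO TABLE your_table
FIELDS TERMINATED BY ','
ENCLOSED BY '"'
LINES TERMINATED BY '\n'
IGNORE 1 ROWS;
-- Rebuild the index once the data is loaded
ALTER TABLE your_table ADD INDEX idx_email (email);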
2. Error Handling:
-- Supply a fallback value when a field arrives empty or NULL
LOAD DATA INFILE '/path/to/file.csv'
INTO TABLE your_table
FIELDS TERMINATED BY ','
ENCLOSED BY '"'
LINES TERMINATED BY '\n'
IGNORE 1 ROWS
(@column1, column2, column3)
SET column1 = IF(@column1 IS NULL OR @column1 = '', 'Unknown', @column1);
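The SET clause above only patches empty values; it does not stop a duplicate key from aborting the load. MySQL's IGNORE and REPLACE modifiers decide what happens when an incoming row collides with an existing unique or primary key: IGNORE skips the incoming row, REPLACE overwrites the existing one. A minimal sketch of the IGNORE form:
-- Skip rows that would violate a unique or primary key
LOAD DATA INFILE '/path/to/file.csv'
IGNORE INTO TABLE your_table
FIELDS TERMINATED BY ','
ENCLOSED BY '"'
LINES TERMINATED BY '\n'
IGNORE 1 ROWS;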
3. Batch Processing:
-- LOAD DATA has no built-in batching: split very large files externally
-- and tag each load so it can be traced (assumes your_table has a batch_id column)
SET @batch_id = 1;
LOAD DATA INFILE '/path/to/large_file_part1.csv'
INTO TABLE your_table
FIELDS TERMINATED BY ','
ENCLOSED BY '"'
LINES TERMINATED BY '\n'
IGNORE 1 ROWS
(column1, column2, column3)
SET batch_id = @batch_id;
PostgreSQL CSV Import
Using COPY Command
Basic COPY Import:
COPY your_table FROM '/path/to/your/file.csv'
WITH (FORMAT csv, HEADER true, DELIMITER ',');
Advanced COPY with Options:
COPY your_table (column1, column2, column3)
FROM '/path/to/your/file.csv'
WITH (
FORMAT csv,
HEADER true,
DELIMITER ',',
QUOTE '"',
ESCAPE '"',
NULL ''
);
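Keep in mind that COPY ... FROM 'file' reads from the database server's filesystem and requires superuser rights or membership in the pg_read_server_files role. When the CSV is on your client machine, psql's \copy meta-command accepts the same options but streams the file through the client connection. A minimal sketch:
\copy your_table (column1, column2, column3) FROM '/local/path/file.csv' WITH (FORMAT csv, HEADER true, DELIMITER ',')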
PostgreSQL Import Examples
Example 1: Basic Data Import
-- Create table
CREATE TABLE employees (
id SERIAL PRIMARY KEY,
first_name VARCHAR(50) NOT NULL,
last_name VARCHAR(50) NOT NULL,
email VARCHAR(100) UNIQUE,
department VARCHAR(50),
salary DECIMAL(10,2),
hire_date DATE
);
-- Import data
COPY employees (first_name, last_name, email, department, salary, hire_date)
FROM '/path/to/employees.csv'
WITH (FORMAT csv, HEADER true, DELIMITER ',');
Example 2: Handling Different Data Types
-- Create table with various data types
CREATE TABLE products (
id SERIAL PRIMARY KEY,
name VARCHAR(100) NOT NULL,
description TEXT,
price DECIMAL(10,2),
in_stock BOOLEAN,
tags TEXT[],
metadata JSONB,
created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);
-- Import with type conversion
COPY products (name, description, price, in_stock, tags, metadata)
FROM '/path/to/products.csv'
WITH (
FORMAT csv,
HEADER true,
DELIMITER ',',
QUOTE '"',
NULL ''
);
Example 3: Import with Data Transformation
-- Create a staging table with text columns matching the CSV layout
CREATE TEMP TABLE staging_import (
column1 TEXT,
column2 TEXT,
column3 TEXT
);
-- Import raw data into the staging table
COPY staging_import FROM '/path/to/raw_data.csv'
WITH (FORMAT csv, HEADER true);
-- Transform, cast, and insert into the final table
INSERT INTO final_table (column1, column2, column3)
SELECT
column1::INTEGER,
column2,
column3::DECIMAL(10,2)
FROM staging_import;
PostgreSQL Import Best Practices
1. Performance Optimization:
-- Disable autocommit for large imports
BEGIN;
-- Import data
COPY your_table FROM '/path/to/file.csv'
WITH (FORMAT csv, HEADER true, DELIMITER ',');
-- Create indexes after import
CREATE INDEX idx_column1 ON your_table (column1);
CREATE INDEX idx_column2 ON your_table (column2);
COMMIT;
2. Error Handling:
-- COPY aborts on the first bad row (PostgreSQL 17 adds an ON_ERROR option to skip
-- malformed rows). A common pattern is to load into a staging table and upsert,
-- so duplicate keys do not fail the whole import:
CREATE TEMP TABLE staging_table (LIKE your_table INCLUDING DEFAULTS);
COPY staging_table FROM '/path/to/file.csv'
WITH (FORMAT csv, HEADER true, DELIMITER ',');
INSERT INTO your_table
SELECT * FROM staging_table
ON CONFLICT (id) DO UPDATE SET
column1 = EXCLUDED.column1,
column2 = EXCLUDED.column2;
3. Parallel Processing:
-- Split the large file and run each COPY from its own session/connection
-- Session 1:
COPY your_table FROM '/path/to/file_part1.csv'
WITH (FORMAT csv, HEADER true, DELIMITER ',');
-- Session 2:
COPY your_table FROM '/path/to/file_part2.csv'
WITH (FORMAT csv, HEADER true, DELIMITER ',');
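On PostgreSQL 14 and later you can also monitor a long-running load from another session through the pg_stat_progress_copy view; a minimal sketch:
-- Run from a separate session while the COPY is in progress
SELECT pid, relid::regclass AS table_name, bytes_processed, bytes_total, tuples_processed
FROM pg_stat_progress_copy;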
SQL Server CSV Import
Using BULK INSERT Command
Basic BULK INSERT:
BULK INSERT your_table
FROM 'C:\path\to\your\file.csv'
WITH (
FIELDTERMINATOR = ',',
ROWTERMINATOR = '\n',
FIRSTROW = 2
);
Advanced BULK INSERT with Options:
BULK INSERT your_table
FROM 'C:\path\to\your\file.csv'
WITH (
FIELDTERMINATOR = ',',
ROWTERMINATOR = '\n',
FIRSTROW = 2,
FIELDQUOTE = '"',
KEEPNULLS,
TABLOCK
);
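On SQL Server 2017 and later, the FORMAT = 'CSV' option tells BULK INSERT to parse the file as RFC 4180 CSV, which handles quoted fields containing commas or line breaks more reliably than raw field terminators. A minimal sketch:
BULK INSERT your_table
FROM 'C:\path\to\your\file.csv'
WITH (
FORMAT = 'CSV',
FIRSTROW = 2,
FIELDQUOTE = '"'
);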
SQL Server Import Examples
Example 1: Basic Data Import
-- Create table
CREATE TABLE customers (
id INT IDENTITY(1,1) PRIMARY KEY,
name NVARCHAR(100) NOT NULL,
email NVARCHAR(100) UNIQUE,
phone NVARCHAR(20),
created_at DATETIME2 DEFAULT GETDATE()
);
-- Import data
BULK INSERT customers
FROM 'C:\path\to\customers.csv'
WITH (
FIELDTERMINATOR = ',',
ROWTERMINATOR = '\n',
FIRSTROW = 2
);
Example 2: Import with Data Transformation
-- Create a staging table with text columns matching the CSV layout
CREATE TABLE temp_import (
column1 NVARCHAR(100),
column2 NVARCHAR(100),
column3 NVARCHAR(100)
);
-- Import raw data into the staging table
BULK INSERT temp_import
FROM 'C:\path\to\raw_data.csv'
WITH (
FIELDTERMINATOR = ',',
ROWTERMINATOR = '\n',
FIRSTROW = 2
);
-- Transform, convert types, and insert (TRY_CAST returns NULL instead of failing)
INSERT INTO final_table (column1, column2, column3)
SELECT
TRY_CAST(column1 AS INT),
column2,
TRY_CAST(column3 AS DECIMAL(10,2))
FROM temp_import;
-- Clean up
DROP TABLE temp_import;
Example 3: Import with Error Handling
-- Create error log table
CREATE TABLE import_errors (
line_number INT,
error_message NVARCHAR(MAX),
raw_data NVARCHAR(MAX)
);
-- Import with error handling
BEGIN TRY
BULK INSERT your_table
FROM 'C:\path\to\file.csv'
WITH (
FIELDTERMINATOR = ',',
ROWTERMINATOR = '\n',
FIRSTROW = 2,
ERRORFILE = 'C:\path\to\errors.txt'
);
END TRY
BEGIN CATCH
INSERT INTO import_errors (error_message)
VALUES (ERROR_MESSAGE());
END CATCH;
SQL Server Import Best Practices
1. Performance Optimization:
-- Use TABLOCK for better performance
BULK INSERT your_table
FROM 'C:\path\to\file.csv'
WITH (
FIELDTERMINATOR = ',',
ROWTERMINATOR = '\n',
FIRSTROW = 2,
TABLOCK,
BATCHSIZE = 10000
);
2. Memory Management:
-- BULK INSERT option values must be literals, so chunked loads need dynamic SQL
DECLARE @batch_size INT = 10000;
DECLARE @first_row INT = 2; -- skip the header row
DECLARE @rows_loaded INT = 1;
DECLARE @sql NVARCHAR(MAX);
WHILE @rows_loaded > 0
BEGIN
-- Build and run one batch; the loop ends when a batch loads no rows
SET @sql = N'BULK INSERT your_table
FROM ''C:\path\to\file.csv''
WITH (
FIELDTERMINATOR = '','',
ROWTERMINATOR = ''\n'',
FIRSTROW = ' + CAST(@first_row AS NVARCHAR(20)) + N',
LASTROW = ' + CAST(@first_row + @batch_size - 1 AS NVARCHAR(20)) + N'
);
SET @cnt = @@ROWCOUNT;';
EXEC sp_executesql @sql, N'@cnt INT OUTPUT', @cnt = @rows_loaded OUTPUT;
SET @first_row = @first_row + @batch_size;
END;
Programmatic Import Solutions
Python Solutions
MySQL Import with Python:
import pandas as pd
import pymysql  # driver used by the mysql+pymysql connection URL below
from sqlalchemy import create_engine

def import_csv_to_mysql(csv_file, table_name, connection_params):
    """Import CSV file to MySQL database"""
    # Read CSV file
    df = pd.read_csv(csv_file)
    # Create database connection
    engine = create_engine(f"mysql+pymysql://{connection_params['user']}:{connection_params['password']}@{connection_params['host']}/{connection_params['database']}")
    # Import to database
    df.to_sql(table_name, engine, if_exists='append', index=False)
    print(f"Successfully imported {len(df)} rows to {table_name}")

# Usage
connection_params = {
    'host': 'localhost',
    'database': 'mydb',
    'user': 'myuser',
    'password': 'mypassword'
}
import_csv_to_mysql('data.csv', 'my_table', connection_params)
PostgreSQL Import with Python:
import pandas as pd
import psycopg2
from sqlalchemy import create_engine

def import_csv_to_postgresql(csv_file, table_name, connection_params):
    """Import CSV file to PostgreSQL database"""
    # Read CSV file
    df = pd.read_csv(csv_file)
    # Create database connection
    engine = create_engine(f"postgresql://{connection_params['user']}:{connection_params['password']}@{connection_params['host']}/{connection_params['database']}")
    # Import to database
    df.to_sql(table_name, engine, if_exists='append', index=False)
    print(f"Successfully imported {len(df)} rows to {table_name}")

# Usage
connection_params = {
    'host': 'localhost',
    'database': 'mydb',
    'user': 'myuser',
    'password': 'mypassword'
}
import_csv_to_postgresql('data.csv', 'my_table', connection_params)
SQL Server Import with Python:
import pandas as pd
import pyodbc
from sqlalchemy import create_engine

def import_csv_to_sqlserver(csv_file, table_name, connection_params):
    """Import CSV file to SQL Server database"""
    # Read CSV file
    df = pd.read_csv(csv_file)
    # Create database connection
    engine = create_engine(f"mssql+pyodbc://{connection_params['user']}:{connection_params['password']}@{connection_params['server']}/{connection_params['database']}?driver=ODBC+Driver+17+for+SQL+Server")
    # Import to database
    df.to_sql(table_name, engine, if_exists='append', index=False)
    print(f"Successfully imported {len(df)} rows to {table_name}")

# Usage
connection_params = {
    'server': 'localhost',
    'database': 'mydb',
    'user': 'myuser',
    'password': 'mypassword'
}
import_csv_to_sqlserver('data.csv', 'my_table', connection_params)
Advanced Python Import with Error Handling
import pandas as pd
import logging
from sqlalchemy import create_engine
from sqlalchemy.exc import SQLAlchemyError

def robust_csv_import(csv_file, table_name, connection_string, chunk_size=10000):
    """Robust CSV import with error handling and chunking"""
    # Set up logging
    logging.basicConfig(level=logging.INFO)
    logger = logging.getLogger(__name__)
    try:
        # Create database connection
        engine = create_engine(connection_string)
        # Read CSV in chunks
        chunk_count = 0
        total_rows = 0
        for chunk in pd.read_csv(csv_file, chunksize=chunk_size):
            try:
                # Import chunk
                chunk.to_sql(table_name, engine, if_exists='append', index=False)
                chunk_count += 1
                total_rows += len(chunk)
                logger.info(f"Imported chunk {chunk_count}: {len(chunk)} rows")
            except SQLAlchemyError as e:
                logger.error(f"Error importing chunk {chunk_count + 1}: {str(e)}")
                continue
        logger.info(f"Import complete: {total_rows} rows imported in {chunk_count} chunks")
        return True
    except Exception as e:
        logger.error(f"Fatal error during import: {str(e)}")
        return False

# Usage
connection_string = "mysql+pymysql://user:password@localhost/database"
success = robust_csv_import('large_file.csv', 'my_table', connection_string, 5000)
Best Practices for CSV to Database Import
Data Preparation
1. Data Cleaning:
- Remove empty rows and columns
- Handle missing values appropriately
- Validate data types and formats
- Check for duplicate records
2. Schema Design:
- Choose appropriate data types
- Set up proper constraints
- Create necessary indexes
- Plan for future growth
3. Performance Optimization:
- Use appropriate import methods
- Consider batch processing for large files
- Disable indexes during import
- Use parallel processing when possible
Error Handling
1. Validation:
- Check data integrity before import
- Validate against business rules
- Handle constraint violations
- Log and report errors
2. Recovery:
- Implement rollback procedures
- Create backup strategies
- Plan for partial failures
- Monitor import progress
3. Monitoring:
- Track import performance
- Monitor system resources
- Set up alerts for failures
- Maintain audit trails (see the sketch after this list)
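For the audit-trail point above, many teams keep a small log table that records each load. A minimal sketch with hypothetical names (swap TIMESTAMP for DATETIME2 on SQL Server):
CREATE TABLE import_audit (
id INT PRIMARY KEY,
source_file VARCHAR(255),
target_table VARCHAR(100),
rows_loaded INT,
loaded_at TIMESTAMP
);
-- Record one row per import run
INSERT INTO import_audit (id, source_file, target_table, rows_loaded, loaded_at)
VALUES (1, 'customers.csv', 'customers', 10000, CURRENT_TIMESTAMP);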
Common Issues and Solutions
Issue 1: Data Type Mismatches
Problem: CSV data doesn't match expected database types
Solutions:
- Use data type conversion functions
- Create staging tables for transformation
- Implement validation rules
- Use appropriate import tools
Issue 2: Performance Issues
Problem: Import is too slow or causes timeouts
Solutions:
- Use batch processing
- Optimize database settings
- Consider parallel processing
- Use appropriate import methods
Issue 3: Memory Constraints
Problem: Large files cause memory issues
Solutions:
- Use streaming import methods
- Process files in chunks
- Optimize memory usage
- Consider database-specific optimizations
Issue 4: Character Encoding Issues
Problem: Special characters not displaying correctly
Solutions:
- Use UTF-8 encoding consistently
- Handle BOM characters properly
- Configure database character sets
- Validate encoding before import
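Each system also lets you declare the file's encoding explicitly at import time rather than relying on server defaults. Minimal sketches for UTF-8, using the placeholder table from earlier sections (CODEPAGE 65001 requires a recent SQL Server version):
-- MySQL: declare the file's character set
LOAD DATA INFILE '/path/to/file.csv'
INTO TABLE your_table
CHARACTER SET utf8mb4
FIELDS TERMINATED BY ','
ENCLOSED BY '"'
LINES TERMINATED BY '\n'
IGNORE 1 ROWS;
-- PostgreSQL: declare the encoding in COPY
COPY your_table FROM '/path/to/file.csv'
WITH (FORMAT csv, HEADER true, ENCODING 'UTF8');
-- SQL Server: code page 65001 = UTF-8
BULK INSERT your_table
FROM 'C:\path\to\file.csv'
WITH (FIELDTERMINATOR = ',', ROWTERMINATOR = '\n', FIRSTROW = 2, CODEPAGE = '65001');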
Conclusion
Importing CSV files into databases is a critical skill that requires understanding both the data and the target database system. The methods we've covered—native SQL commands, programmatic solutions, and best practices—each have their strengths and are suitable for different scenarios.
Choose Native SQL Commands when:
- Working with standard database systems
- Need maximum performance
- Have direct database access
- Want to leverage database-specific features
Choose Programmatic Solutions when:
- Need custom data transformation
- Working with multiple database systems
- Want to automate the process
- Need advanced error handling
Choose Hybrid Approaches when:
- Working with complex data structures
- Need both performance and flexibility
- Want to leverage multiple tools
- Have specific business requirements
Remember that successful CSV to database imports require careful planning, proper data preparation, and thorough testing. Always validate your results and implement proper error handling to ensure data integrity throughout the process.
For more CSV data processing tools and guides, explore our CSV Tools Hub or try our CSV to Excel Converter for data preparation.