Amazon OpenSearch (Elasticsearch): a practical guide for fast, scalable search
Sep 12, 2025
aws · opensearch · elasticsearch · search
Amazon OpenSearch Service (formerly Amazon Elasticsearch) is a managed search and analytics engine. It shines for full‑text search, log analytics, metrics, APM, and now vector search with k‑NN.
When OpenSearch is the right tool
- Need fast text search (prefix, fuzzy, relevance, highlighting)
- Query/aggregate semi‑structured JSON at scale
- Time‑series analytics (logs, metrics) with rollups/ILM
- Vector similarity search for RAG/semantic features
If you just need structured filters/joins, start with Postgres. If you need search relevance or large time‑series analytics, OpenSearch fits.
Core building blocks
- Index: Logical collection of documents. Choose shards/replicas per index.
- Document: JSON you index and query.
- Mapping: Field types and analyzer config (e.g., text vs keyword).
- Analyzer: Tokenization + filters; controls search behavior.
- ILM: Index lifecycle (hot → warm → cold → delete) for cost control; OpenSearch implements this as Index State Management (ISM).
- k‑NN: Vector fields and ANN indexes for semantic search.
Typical architectures
- Search for app data: App → (queue/stream) → indexer → OpenSearch; app queries via REST/SDK.
- Logs/metrics: App/agents → OpenSearch Ingestion / Firehose / Logstash → OpenSearch, visualized in OpenSearch Dashboards.
- RAG: Embed text (SageMaker/Bedrock/OpenAI) → store vectors in OpenSearch k‑NN; hybrid BM25 + vector queries.
Index design and mappings
Choose field types deliberately:
- text for full‑text search (with analyzer). Add a keyword sibling field for exact filters/sorts.
- Dates as date. IDs as keyword. Numbers with appropriate numeric types.
- For arrays, OpenSearch treats each element as a separate value.
Minimal example mapping:
PUT my_articles
{
  "settings": { "number_of_shards": 3, "number_of_replicas": 1 },
  "mappings": {
    "properties": {
      "title":   { "type": "text", "analyzer": "standard", "fields": { "raw": { "type": "keyword" } } },
      "content": { "type": "text" },
      "tags":    { "type": "keyword" },
      "published_at": { "type": "date" },
      "embedding": { "type": "knn_vector", "dimension": 768 }
    }
  }
}
Query example (keyword filter + text relevance):
POST my_articles/_search
{
  "query": {
    "bool": {
      "filter": { "terms": { "tags": ["aws", "search"] } },
      "must":   { "multi_match": { "query": "open search performance", "fields": ["title^3", "content"] } }
    }
  },
  "highlight": { "fields": { "content": {} } }
}
Vector similarity (k‑NN cosine):
POST my_articles/_search
{
  "size": 10,
  "query": {
    "knn": {
      "embedding": {
        "vector": [0.12, -0.04, ...],
        "k": 10
      }
    }
  }
}
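The RAG architecture above pairs BM25 with vectors. One way to sketch that hybrid, assuming a recent OpenSearch version that accepts a knn clause inside a bool query (the dedicated hybrid query plus a normalization search pipeline is the more principled score-fusion option); the three-element vector is a placeholder and must match the mapped dimension (768):
POST my_articles/_search
{
  "size": 10,
  "query": {
    "bool": {
      "filter": { "terms": { "tags": ["aws", "search"] } },
      "should": [
        { "multi_match": { "query": "open search performance", "fields": ["title^3", "content"] } },
        { "knn": { "embedding": { "vector": [0.12, -0.04, 0.33], "k": 10 } } }
      ]
    }
  }
}
Lexical and vector scores simply add here; weight the clauses (e.g., with boost) if one signal should dominate.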
Performance tuning that matters
- Shards: Start small. 1–3 primary shards per index is typical. Too many shards waste memory and hurt query speed.
- Replicas: 1 replica for HA. Increase only to scale reads.
- Routing: If you have natural partitions (tenant ID), use custom routing to keep related docs in one shard.
- Refresh interval: Increase to 10–30s for heavy write throughput (default 1s). For log analytics, use -1 during backfills, then restore it (settings sketch after this list).
- Doc model: Prefer denormalization; avoid parent/child except when truly needed.
- Avoid deep pagination: Use search_after instead of from/size beyond a few thousand results (example after this list).
- Caches: Keep frequently reused clauses in filter context; OpenSearch caches filter bitsets and reuses them across queries.
Cost control playbook
- ILM: Hot (SSD/instance storage) → warm (UltraWarm) → cold (cold storage) → delete after retention (policy sketch after this list).
- Right‑size: Choose memory‑optimized for aggregations, storage‑optimized for logs. Use Graviton where available.
- Compression: Use best_compression for long‑lived analytic indices.
- Serverless: For spiky/low‑ops workloads, consider OpenSearch Serverless to offload capacity planning.
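A minimal ISM policy sketch for that lifecycle; the policy name, index pattern, ages, and the warm_migration action (which moves indices to UltraWarm on Amazon OpenSearch Service) are illustrative assumptions:
PUT _plugins/_ism/policies/logs_lifecycle
{
  "policy": {
    "description": "Hot, then UltraWarm, then delete",
    "default_state": "hot",
    "states": [
      {
        "name": "hot",
        "actions": [],
        "transitions": [{ "state_name": "warm", "conditions": { "min_index_age": "7d" } }]
      },
      {
        "name": "warm",
        "actions": [{ "warm_migration": {} }],
        "transitions": [{ "state_name": "delete", "conditions": { "min_index_age": "30d" } }]
      },
      {
        "name": "delete",
        "actions": [{ "delete": {} }]
      }
    ],
    "ism_template": [{ "index_patterns": ["logs-*"], "priority": 100 }]
  }
}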
Security and access
- VPC‑only access when possible.
- Use IAM or Cognito; enable fine‑grained access control with document/field‑level security for multi‑tenant workloads (role sketch after this list).
- Enforce HTTPS, rotate master user, restrict IPs, and turn on audit logs for sensitive data.
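With fine‑grained access control on, tenant isolation can be expressed as a security role. A sketch that assumes a hypothetical tenant_id keyword field; the role name, index pattern, and hidden field are illustrative:
PUT _plugins/_security/api/roles/tenant_a_reader
{
  "index_permissions": [
    {
      "index_patterns": ["my_articles*"],
      "dls": "{ \"term\": { \"tenant_id\": \"tenant-a\" } }",
      "fls": ["~embedding"],
      "allowed_actions": ["read"]
    }
  ]
}
Document‑level security (dls) limits which documents the role can read, field‑level security (fls) hides the embedding field here, and the role is then mapped to users or IAM roles through the usual role mappings.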
Operations & monitoring
- CloudWatch metrics: CPU, JVMMemoryPressure, MasterNotDiscovered, ClusterStatus. Alert early.
- Slow logs: Enable indexing/search slow logs to spot bad queries/mappings (threshold sketch after this list).
- Snapshots: Automated S3 snapshots for DR; practice restore.
- Versioning: Plan blue/green domain upgrades; test queries against the new version before cutover.
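Slow log publishing to CloudWatch Logs is switched on at the domain level; the thresholds themselves are per‑index settings. A sketch with illustrative thresholds:
PUT my_articles/_settings
{
  "index.search.slowlog.threshold.query.warn": "5s",
  "index.search.slowlog.threshold.query.info": "2s",
  "index.search.slowlog.threshold.fetch.warn": "1s",
  "index.indexing.slowlog.threshold.index.warn": "10s"
}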
Ingestion options
- OpenSearch Ingestion (OSI) – managed, scalable pipelines.
- Kinesis Firehose – easy for logs/metrics.
- Logstash/Beats/Fluent Bit – agent‑based shipping.
- Lambda indexers – transform app data, enrich with embeddings, then index.
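Whichever path you choose, writes usually arrive as _bulk requests. A minimal sketch against the my_articles index from earlier; IDs and field values are placeholders:
POST _bulk
{ "index": { "_index": "my_articles", "_id": "article-1" } }
{ "title": "Tuning OpenSearch shards", "tags": ["aws", "search"], "published_at": "2025-09-12" }
{ "index": { "_index": "my_articles", "_id": "article-2" } }
{ "title": "Hybrid BM25 and vector search", "tags": ["search"], "published_at": "2025-09-10" }
Batch a few hundred to a few thousand documents per request, and check the per‑item errors in the response rather than only the HTTP status.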
Quick checklist for production
- Correct mappings (text + keyword for titles, dates as date)
- ILM with retention, hot → warm → cold
- Shards sized to data (tens of GB per shard; avoid thousands of shards)
- VPC + FGAC, IAM auth, audit logs
- Slow logs + CloudWatch alarms
- Snapshots to S3 (tested restore)
- Hybrid search (BM25 + vector) if using semantic features
OpenSearch can deliver snappy search and scalable analytics, but it rewards deliberate index design, shard planning, and lifecycle/cost tuning. Start lean, measure, and iterate.