
Elasticsearch Usage Guide

Tech Apr 22 18

Elasticsearch is an open-source search engine built on Apache Lucene™, designed to simplify full-text search by exposing a consistent RESTful API while handling the complexity of Lucene internally. Key features include:

Distributed real-time document storage with indexable fields
Real-time distributed search and analytics
Horizontal scalability supporting hundreds of nodes and petabytes of structured or unstructured data

Elasticsearch operates on documents rather than rows and columns, enabling powerful full-text search capabilities.

Operations

Indexing

Data in Elasticsearch is stored in indices. An index can contain multiple document types, each holding numerous documents with various properties. A document's path is structured as:

/index/type/id

For example: "/corporate/employee/123" represents an employee document with ID 123 in the "corporate" index and "employee" type.
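As a small illustration of the path layout above, the following sketch joins the three components into a document path. The helper name `document_path` is hypothetical and only serves this example; it is not part of any Elasticsearch client.

```python
def document_path(index: str, doc_type: str, doc_id: str) -> str:
    """Join index, type and ID into the /index/type/id layout."""
    for part in (index, doc_type, doc_id):
        # Reject empty components and embedded slashes so the path keeps three segments.
        if not part or "/" in part:
            raise ValueError(f"invalid path component: {part!r}")
    return f"/{index}/{doc_type}/{doc_id}"


print(document_path("corporate", "employee", "123"))  # /corporate/employee/123
```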

Basic Document Operations

Insert: POST /corporate/employee/123

{
  "name": "Alex",
  "department": "Engineering",
  "position": "Software Developer",
  "skills": ["Java", "Python", "Elasticsearch"]
}

Update: PUT /corporate/employee/123
Delete: DELETE /corporate/employee/123
Check Existence: HEAD /corporate/employee/123
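The operations above are plain HTTP requests, so they can be prepared with Python's standard library. This is a minimal sketch that only builds the request objects and does not send them. The address `http://localhost:9200` and the helper `build_request` are assumptions made for illustration; the document path follows the /index/type/id layout from the Indexing section.

```python
import json
import urllib.request

BASE_URL = "http://localhost:9200"  # assumption: a local node on the default port


def build_request(method, path, body=None):
    """Prepare (but do not send) an HTTP request for the given path."""
    data = json.dumps(body).encode("utf-8") if body is not None else None
    headers = {"Content-Type": "application/json"} if data is not None else {}
    return urllib.request.Request(BASE_URL + path, data=data, headers=headers, method=method)


employee = {
    "name": "Alex",
    "department": "Engineering",
    "position": "Software Developer",
    "skills": ["Java", "Python", "Elasticsearch"],
}
path = "/corporate/employee/123"

insert = build_request("POST", path, employee)
update = build_request("PUT", path, employee)
delete = build_request("DELETE", path)
exists = build_request("HEAD", path)

for request in (insert, update, delete, exists):
    print(request.get_method(), request.full_url)
```

Passing any of these objects to `urllib.request.urlopen` would send it, which requires a running cluster at the assumed address.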

Search Types

Query DSL Search

Search criteria are specified in the request body:

GET /corporate/employee/_search
{
  "query": {
    "match": {
      "department": "Engineering"
    }
  }
}
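Because the request body is ordinary JSON, it can be assembled programmatically. The sketch below builds the same `match` query as a Python dictionary; the function name `match_query` is hypothetical.

```python
import json


def match_query(field, text):
    """Return a request body with a single match clause on one field."""
    return {"query": {"match": {field: text}}}


body = match_query("department", "Engineering")
print(json.dumps(body, indent=2))
```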

Lightweight Search

Search parameters are passed directly in the URL:

GET /corporate/employee/_search?q=department:Engineering
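When the query string is built in code, the value of `q` should be URL-encoded so that colons and spaces survive the trip. A minimal sketch using the standard library follows; the helper name `lightweight_search_path` is an assumption for this example.

```python
from urllib.parse import urlencode


def lightweight_search_path(index, doc_type, field, value):
    """Build a URL-encoded lightweight search path for one field:value pair."""
    query_string = urlencode({"q": f"{field}:{value}"})
    return f"/{index}/{doc_type}/_search?{query_string}"


print(lightweight_search_path("corporate", "employee", "department", "Engineering"))
# /corporate/employee/_search?q=department%3AEngineering
```

The encoded form `%3A` is decoded by the server back to `:`, so it is equivalent to the literal URL shown above.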

Query Examples

Phrase Matching

GET /corporate/employee/_search
{
  "query": {
    "match_phrase": {
      "skills": "Java Python"
    }
  }
}

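Returns documents where the terms "Java" and "Python" appear consecutively, in that order, in the skills field. Note that values of an array field are indexed with a position gap between them by default, so a phrase does not normally match across separate array elements; it matches when both terms occur inside a single text value.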
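The "consecutive, in order" rule can be illustrated with a toy model. The sketch below is not how Elasticsearch implements phrase queries; it assumes a simplistic lowercase, whitespace-splitting tokenizer in place of a real analyzer, and the function names are hypothetical.

```python
def tokenize(text):
    """Toy analyzer: lowercase and split on whitespace."""
    return text.lower().split()


def phrase_matches(field_text, phrase):
    """True if the phrase tokens occur consecutively and in order in the field."""
    tokens = tokenize(field_text)
    wanted = tokenize(phrase)
    n = len(wanted)
    return n > 0 and any(tokens[i:i + n] == wanted for i in range(len(tokens) - n + 1))


print(phrase_matches("Java Python Elasticsearch", "Java Python"))  # True
print(phrase_matches("Python and Java", "Java Python"))            # False
```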

Highlighting

Add highlighted matches to the response:

GET /corporate/employee/_search
{
  "query": {
    "match": {
      "about": "data analysis"
    }
  },
  "highlight": {
    "fields": {
      "about": {}
    }
  }
}
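Highlighted fragments come back per hit, under a `highlight` key next to `_source`, with matched terms wrapped in `<em>` tags by default. The sketch below pulls those fragments out of a response; the sample response is hand-written for illustration, not real output, and the helper `highlights` is hypothetical.

```python
# Hand-written sample in the general shape of a search response (illustrative only).
sample_response = {
    "hits": {
        "hits": [
            {
                "_id": "123",
                "_source": {"about": "I enjoy data analysis and search"},
                "highlight": {"about": ["I enjoy <em>data</em> <em>analysis</em> and search"]},
            },
            {"_id": "456", "_source": {"about": "No fragments returned here"}},
        ]
    }
}


def highlights(response, field):
    """Map each hit ID to its highlighted fragments for one field."""
    return {
        hit["_id"]: hit.get("highlight", {}).get(field, [])
        for hit in response["hits"]["hits"]
    }


print(highlights(sample_response, "about"))
```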

Index Management

Global Search: GET /_search
Multi-Index Search: GET /index1,index2/_search
Pagination: GET /_search?from=0&size=10
Filter Search: GET /products/_search

{
  "query": {
    "bool": {
      "must": [
        {"term": {"status": "active"}},
        {"range": {"price": {"gte": 100, "lte": 500}}}
      ],
      "must_not": [
        {"term": {"category": "deprecated"}}
      ]
    }
  }
}
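To make the boolean logic of the filter search concrete, the sketch below applies the same conditions to a few in-memory dictionaries. It is a local model of the semantics only (every `must` clause has to hold, no `must_not` clause may hold), not a client call; the sample products and function names are invented for illustration.

```python
def term(doc, field, value):
    """Exact-value comparison, standing in for a term clause."""
    return doc.get(field) == value


def in_range(doc, field, gte, lte):
    """Inclusive range check, standing in for a range clause with gte/lte."""
    value = doc.get(field)
    return value is not None and gte <= value <= lte


def matches(doc):
    """All must clauses hold and no must_not clause holds."""
    must = term(doc, "status", "active") and in_range(doc, "price", 100, 500)
    must_not = term(doc, "category", "deprecated")
    return must and not must_not


products = [
    {"name": "A", "status": "active", "price": 250, "category": "tools"},
    {"name": "B", "status": "active", "price": 250, "category": "deprecated"},
    {"name": "C", "status": "inactive", "price": 250, "category": "tools"},
    {"name": "D", "status": "active", "price": 900, "category": "tools"},
]

print([p["name"] for p in products if matches(p)])  # ['A']
```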

Aggregations

Aggregations group documents into buckets and calculate metrics on them.

Simple Aggregation

Count documents by a specific field:

GET /vehicles/sales/_search
{
  "size": 0,
  "aggs": {
    "popular_colors": {
      "terms": {
        "field": "color"
      }
    }
  }
}

Result:

{
  "aggregations": {
    "popular_colors": {
      "buckets": [
        {"key": "red", "doc_count": 120},
        {"key": "blue", "doc_count": 80},
        {"key": "green", "doc_count": 50}
      ]
    }
  }
}
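On the client side, the bucket list is usually easier to work with as a mapping from key to count. A minimal sketch using the result shown above follows; the helper name `bucket_counts` is hypothetical.

```python
response = {
    "aggregations": {
        "popular_colors": {
            "buckets": [
                {"key": "red", "doc_count": 120},
                {"key": "blue", "doc_count": 80},
                {"key": "green", "doc_count": 50},
            ]
        }
    }
}


def bucket_counts(response, agg_name):
    """Flatten a terms aggregation into {key: doc_count}."""
    buckets = response["aggregations"][agg_name]["buckets"]
    return {bucket["key"]: bucket["doc_count"] for bucket in buckets}


print(bucket_counts(response, "popular_colors"))
# {'red': 120, 'blue': 80, 'green': 50}
```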

Combined Search and Aggregation

GET /vehicles/sales/_search
{
  "size": 0,
  "query": {
    "match_all": {}
  },
  "aggs": {
    "price_stats": {
      "stats": {
        "field": "price"
      }
    }
  }
}
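A `stats` aggregation reports the count, minimum, maximum, average, and sum of a numeric field over the matching documents. The sketch below computes the same five metrics locally to show what the numbers mean; the price values are invented, and the handling of an empty input is a choice made for this example.

```python
def stats(values):
    """Compute count, min, max, avg and sum over a list of numbers."""
    count = len(values)
    if count == 0:
        # Nothing to summarize: undefined metrics are reported as None here.
        return {"count": 0, "min": None, "max": None, "avg": None, "sum": 0}
    total = sum(values)
    return {
        "count": count,
        "min": min(values),
        "max": max(values),
        "avg": total / count,
        "sum": total,
    }


prices = [10000, 20000, 30000]  # hypothetical sale prices
print(stats(prices))
```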

Cluster Operations

Cluster Health

GET /_cluster/health

Status values:

green: All shards and replicas are active.
yellow: All primary shards are active, but some replicas are missing.
red: Some primary shards are missing or failed.
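The three status values follow a simple precedence, which the sketch below spells out. It derives a status from counts of unassigned shards; the function and its parameters are hypothetical and are not fields of the health API response.

```python
def cluster_status(unassigned_primaries, unassigned_replicas):
    """Derive green/yellow/red from the number of unassigned shards."""
    if unassigned_primaries > 0:
        return "red"     # some data is unavailable
    if unassigned_replicas > 0:
        return "yellow"  # all data available, redundancy reduced
    return "green"       # every primary and replica is active


print(cluster_status(0, 0))  # green
print(cluster_status(0, 3))  # yellow
print(cluster_status(1, 3))  # red
```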

Index Configuration

Set index settings at creation:

PUT /blogs
{
  "settings": {
    "number_of_shards": 3,
    "number_of_replicas": 1
  }
}

Shards are distributed across nodes for load balancing and redundancy.
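`number_of_replicas` counts replica copies per primary shard, so the total number of shard copies in the index is the primaries multiplied by one plus the replicas. A short sketch of that arithmetic, with a hypothetical helper name:

```python
def total_shard_copies(number_of_shards, number_of_replicas):
    """Primaries plus all of their replica copies."""
    return number_of_shards * (1 + number_of_replicas)


# The settings above: 3 primaries, each with 1 replica.
print(total_shard_copies(3, 1))  # 6
```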

Node Types

Nodes in Elasticsearch can be:

Master nodes: Manage cluster-wide operations like index creation.
Data nodes: Store data and perform searches.
Ingest nodes: Preprocess documents before indexing.

Shard Allocation

Elasticsearch automatically balances shards across nodes and maintains redundancy through replication. When adding or removing nodes:

Shards are reallocated to maintain balance.
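If a node holding a primary shard is lost, a replica is promoted to primary, and new replicas are created on the remaining nodes to restore redundancy.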
Queries are routed to the appropriate nodes.

Clusters scale horizontally by adding more nodes, with Elasticsearch handling the distribution transparently.
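The idea of spreading shard copies across nodes can be sketched with a toy round-robin placement. This is not Elasticsearch's allocation algorithm; it only models one rule that does hold in practice, namely that a primary and its replicas are placed on different nodes. The node names and the function `allocate` are invented for illustration, and where this sketch raises an error for too few nodes, a real cluster would instead leave the extra replicas unassigned.

```python
def allocate(nodes, number_of_shards, number_of_replicas):
    """Place every shard copy on a node, keeping copies of one shard apart."""
    if len(nodes) <= number_of_replicas:
        raise ValueError("need more nodes than replicas to keep copies on distinct nodes")
    placement = {node: [] for node in nodes}
    for shard in range(number_of_shards):
        for copy in range(1 + number_of_replicas):
            # Offsetting by the copy number guarantees distinct nodes per shard.
            node = nodes[(shard + copy) % len(nodes)]
            role = "primary" if copy == 0 else "replica"
            placement[node].append((shard, role))
    return placement


three_nodes = allocate(["node-1", "node-2", "node-3"], 3, 1)
for node, shards in three_nodes.items():
    print(node, shards)

# Adding a node spreads the same six shard copies over four nodes.
four_nodes = allocate(["node-1", "node-2", "node-3", "node-4"], 3, 1)
```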

Tags: elasticsearch Search Engine Big Data


Copyright © fadingcoder.top
