Comparisons

Apache Doris vs Elasticsearch

Elasticsearch and Apache Doris are both popular in observability, cybersecurity, and real-time analytics. However, Elasticsearch can be costly in storage and write resources. Apache Doris reduces these costs through efficient storage and high compression, and adds comprehensive analytical capabilities such as JOINs, along with faster query performance.

Featured Migration Cases

Guance

“By replacing Elasticsearch with the Doris Commercial Distributed Version supported by VeloDB, GuanceDB showcases a big stride in improving data processing speed and reducing costs.”

70%

Cost reduction

2~3x

Faster full-text search performance

VARIANT (Data Type)

Flexible to handle semi-structured data in log tracing

Bestpay

“Previously, we used multiple components for complex security analysis... Adopting Doris as a unified solution has significantly improved data writes, query performance and storage efficiency.”

4x

Faster write speeds

3x

Better query performance

50%

Storage space savings

ZTO

“Compared to the original OLAP database, query performance has improved 5-10 times, concurrency has doubled, and analysis time has dropped from 10 minutes to under 1 minute for 90% of cases, all while using just one-third of the original resources.”

2x

Higher report analysis concurrency

65%

Storage space reduction

SQL

Simplified query with standard SQL

Why Choose Apache Doris

Apache Doris

  • Open Source License
    Licensed under Apache License 2.0
    Stable license, governed by the Apache Software Foundation
  • Architecture

    Higher flexibility and elasticity:

    Strict workload isolation by workload group, powered by Linux CGroups, ideal for multi-tenancy
    Compute-Storage decoupled and coupled modes
  • Deployment

    Supports three deployment options:

    Cloud-native services on AWS, Azure, and GCP, as well as SaaS and BYOC versions, supported by VeloDB (a commercial company founded by the creators of Apache Doris)

    On-premise deployment, with extended long-term support from VeloDB

  • Real-Time Data Writes
    High throughput: indexing is done on only one replica
    Pull-based ingestion via Kafka CDC, with no extra tooling required
    Supports Logstash and Beats output plugins
  • Real-Time Data Storage
    Low storage consumption, with compression ratios from 1:5 up to 1:10
    Unique model supports both write and read optimization (Merge-on-Write and Merge-on-Read), retaining 90% of write speed when data is deduplicated by key
    Aggregation model supports strong consistency, allows aggregated data updates, and coexists with original data
    Flexible Schema Change to meet dynamic business needs
  • Real-Time Data Queries
    Fast across a wide range of query workloads
    Supports multi-table JOINs and optimization for complex analysis
    Easy to use with standard SQL
    Open MySQL ecosystem

Elasticsearch

  • Open Source License
    License changed from Apache License 2.0 to the Elastic License, then to AGPL
    License subject to change, governed by Elastic NV
  • Architecture

    Traditional deployment with limited elasticity:

    Soft workload isolation by thread group
    Does not support decoupling compute and storage
  • Deployment

    Supports only two deployment options:

    Commercial distribution supports only cloud SaaS and on-premise deployment
  • Real-Time Data Writes
    Low throughput: indexing is repeated on every data replica
    Requires additional tools such as Logstash and Beats for pull-based ingestion, which is less convenient
  • Real-Time Data Storage
    High storage consumption with a compression ratio of 1:1.5
    Unique model only supports write optimization, with write performance loss up to 3 times
    The aggregation model does not allow aggregated data to be updated and does not coexist with the original data
    Limited support for Schema Change
  • Real-Time Data Queries
    Good at point queries, but not well suited to data analysis
    No support for multi-table JOINs or complex analysis
    Harder to use because of its custom query DSL
    Proprietary Elasticsearch ecosystem
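
To make the storage figures in the two lists concrete, here is a rough arithmetic sketch in Python. It assumes a ratio of 1:N means raw data shrinks to 1/N of its size on disk, and it uses a hypothetical 100 TB raw volume. It ignores replicas, indexes, and workload differences, so treat the output as an illustration of the quoted ratios rather than a sizing estimate.

```python
def stored_size_tb(raw_tb: float, ratio: float) -> float:
    """Size on disk for raw_tb of raw data at a compression ratio of 1:ratio."""
    return raw_tb / ratio


RAW_TB = 100.0  # hypothetical raw data volume, not taken from the page

# Compression ratios as quoted in the comparison lists.
DORIS_LOW, DORIS_HIGH = 5.0, 10.0  # 1:5 to 1:10
ES_RATIO = 1.5                     # 1:1.5

doris_worst = stored_size_tb(RAW_TB, DORIS_LOW)
doris_best = stored_size_tb(RAW_TB, DORIS_HIGH)
es_size = stored_size_tb(RAW_TB, ES_RATIO)

print(f"Doris:         {doris_best:.1f} to {doris_worst:.1f} TB")
print(f"Elasticsearch: {es_size:.1f} TB")
print(f"Reduction:     {1 - doris_worst / es_size:.0%} to {1 - doris_best / es_size:.0%}")
```

Under these assumptions the quoted ratios alone imply a storage reduction of roughly 70% to 85%.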

Performance Comparison


Observability & Cyber Security

The HTTP Logs benchmark is an official Elasticsearch performance test designed for log storage and analysis. It uses a real-world HTTP log dataset to evaluate indexing performance, storage efficiency, and query performance.

This benchmark comprises 11 queries commonly used in log analysis scenarios, including keyword search, time range queries, aggregations, and sorting. As a result, it is highly suitable for assessing performance in observability and network security analysis contexts.
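
The four query categories named above can be illustrated with a small in-memory Python sketch. The records, field names, and helper functions are hypothetical and are not the benchmark's actual dataset or queries; the sketch only shows what each category of query computes.

```python
from collections import Counter
from datetime import datetime

# Hypothetical HTTP log records; field names are illustrative only.
LOGS = [
    {"ts": datetime(2024, 12, 1, 10, 0), "status": 200, "request": "GET /index.html", "size": 1200},
    {"ts": datetime(2024, 12, 1, 10, 5), "status": 404, "request": "GET /images/logo.gif", "size": 300},
    {"ts": datetime(2024, 12, 1, 11, 0), "status": 200, "request": "GET /images/banner.gif", "size": 5400},
    {"ts": datetime(2024, 12, 1, 12, 30), "status": 500, "request": "POST /api/login", "size": 150},
]


def keyword_search(logs, term):
    """Keyword search: records whose request line contains the term."""
    return [r for r in logs if term.lower() in r["request"].lower()]


def time_range(logs, start, end):
    """Time range query: records with start <= ts < end."""
    return [r for r in logs if start <= r["ts"] < end]


def count_by_status(logs):
    """Aggregation: number of records per HTTP status code."""
    return Counter(r["status"] for r in logs)


def top_by_size(logs, n):
    """Sorting: the n largest responses, biggest first."""
    return sorted(logs, key=lambda r: r["size"], reverse=True)[:n]


print(len(keyword_search(LOGS, "images")))
print(len(time_range(LOGS, datetime(2024, 12, 1, 10, 0), datetime(2024, 12, 1, 11, 0))))
print(count_by_status(LOGS))
print(top_by_size(LOGS, 1)[0]["request"])
```

A log engine answers the same questions with inverted indexes and columnar scans instead of linear passes over a list, which is the work the benchmark measures.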


Real-Time Analytics

ClickBench is a benchmarking tool to evaluate the performance of analytical databases. It focuses on testing the performance of large, flat tables rather than complex multi-table joins. It uses real-world data from a major web analytics platform, covering typical scenarios such as clickstream analysis and structured logs.

The benchmark consists of a set of queries that test aggregation operations and single-table performance, without involving complex joins. This makes it especially useful for evaluating databases optimized for real-time analytics and large-scale data processing.
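
As a rough illustration of this query shape, the Python sketch below runs two single-table aggregations against a tiny in-memory table. SQLite stands in for an analytical database purely so the example is self-contained. The table and column names are modeled on the ClickBench `hits` table but should be treated as assumptions, and the rows are made up.

```python
import sqlite3

# SQLite is only a stand-in here; it is not one of the benchmarked systems.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE hits (UserID INTEGER, RegionID INTEGER, "
    "AdvEngineID INTEGER, SearchPhrase TEXT)"
)

# Made-up rows: (UserID, RegionID, AdvEngineID, SearchPhrase)
rows = [
    (1, 10, 0, ""),
    (2, 10, 3, "doris"),
    (3, 20, 0, ""),
    (1, 10, 5, "olap"),
    (4, 10, 0, ""),
]
conn.executemany("INSERT INTO hits VALUES (?, ?, ?, ?)", rows)

# Filtered count over one wide, flat table: no joins involved.
filtered = conn.execute(
    "SELECT COUNT(*) FROM hits WHERE AdvEngineID <> 0"
).fetchone()[0]

# Group-by with a distinct count, sorted and limited.
by_region = conn.execute(
    "SELECT RegionID, COUNT(DISTINCT UserID) AS u "
    "FROM hits GROUP BY RegionID ORDER BY u DESC LIMIT 10"
).fetchall()

print(filtered)
print(by_region)
```

Both statements touch a single table, so their cost is dominated by scan and aggregation speed, which is the behavior this kind of benchmark isolates.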

Note: These test results are archived benchmarks captured in December 2024. Current real-time comparisons are maintained at ClickBench.

ClickBench Benchmark