Learn about Druid caching technology

WBOY
Release: 2023-06-21 14:13:19

Apache Druid is an open-source distributed data store built for real-time data analysis. It is characterized by high performance, low latency, and scalability. To further improve query performance and reduce the load on the cluster, Druid provides query caching. This article introduces the essentials of Druid caching.

1. Druid cache overview

Druid provides two kinds of query cache: a result cache on the Broker, which stores the final result of a whole query, and a data cache, typically kept on the Historical nodes, which stores partial results for individual data blocks (segments). Both serve the same purpose: to shorten query time and reduce the load that queries place on the cluster.

  1. Result cache on Broker

The result cache on the Broker holds the final results of queries. Once a result is cached, an identical later query can be answered directly from the cache without being recomputed. The cache is held in memory by default, or in an external store such as Memcached, and the lifetime of its entries is configurable. Result caching is generally used in scenarios that demand fast responses to repeated queries.

  2. Data cache on Historical node

The cache on the Historical node works at the level of data blocks (segments). Historical nodes store segments and answer queries over them. When a Historical node receives a query, it checks for each queried segment whether the result for that segment is already in its cache; if so, it returns the cached result without reading the segment again. If not, it scans the segment, computes the result, and caches it for later queries. Because results are cached per segment, a query that covers only some of the segments of an earlier query can still reuse the cached results for the segments they share. This cache can greatly improve query performance and response speed in many scenarios.
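How the two caches work together can be pictured with a small toy model. This is only a sketch, not Druid's implementation: the class `TwoLevelCache`, the `scan_segment` callback, and the segment names are hypothetical, and it assumes the segment-level cache stores one partial result per query and segment, which is then merged into the final result.

```python
from typing import Callable, Dict, List, Tuple


class TwoLevelCache:
    """Toy model: whole-query results plus per-segment partial results."""

    def __init__(self, scan_segment: Callable[[str, str], int]) -> None:
        self.scan_segment = scan_segment  # expensive work we want to avoid
        self.result_cache: Dict[Tuple[str, Tuple[str, ...]], int] = {}
        self.segment_cache: Dict[Tuple[str, str], int] = {}
        self.scans = 0  # number of real segment scans performed

    def query(self, query_id: str, segments: List[str]) -> int:
        # Step 1: try the whole-query result cache.
        result_key = (query_id, tuple(sorted(segments)))
        if result_key in self.result_cache:
            return self.result_cache[result_key]

        # Step 2: fall back to per-segment partial results.
        total = 0
        for segment in segments:
            segment_key = (query_id, segment)
            if segment_key not in self.segment_cache:
                self.scans += 1
                self.segment_cache[segment_key] = self.scan_segment(query_id, segment)
            total += self.segment_cache[segment_key]

        # Step 3: merge and remember the final result.
        self.result_cache[result_key] = total
        return total


# Hypothetical row counts per segment.
rows = {"seg-mon": 10, "seg-tue": 20, "seg-wed": 30}
cache = TwoLevelCache(lambda query_id, segment: rows[segment])

print(cache.query("count", ["seg-mon", "seg-tue"]), cache.scans)
print(cache.query("count", ["seg-mon", "seg-tue"]), cache.scans)
print(cache.query("count", ["seg-tue", "seg-wed"]), cache.scans)
```

The second call is answered from the result cache, and the third call scans only the one segment that has not been seen before.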

2. How to use Druid cache

You need to pay attention to the following points when using cache in Druid:

  1. Enable caching in queries

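Caching is switched on in the service configuration of the Broker and Historical nodes, and individual queries can then opt in or out through the query context. A context flag only takes effect when the corresponding cache is also enabled in the service configuration. The query context parameters are as follows:

(1) useResultLevelCache: whether this query may read from the result cache on the Broker; the default is true, and setting it to false bypasses the result cache;

(2) useCache: whether this query may read from the data block (per-segment) cache; the default is true, and setting it to false bypasses that cache;

(3) populateResultLevelCache and populateCache: whether the results of this query are written into the result cache and the data block cache, respectively; both default to true.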
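As an illustration, the cache flags can be attached to a native query before it is sent. The sketch below is hedged: the helper `with_cache_context` and the datasource name `wikipedia` are hypothetical, and the context keys `useCache`, `populateCache`, `useResultLevelCache`, and `populateResultLevelCache` are the names used in Druid's query context documentation as far as I know, so check them against your Druid version.

```python
import copy
import json


def with_cache_context(query, use_cache=True, use_result_level_cache=True):
    """Return a copy of a native query with the cache flags set in its context."""
    updated = copy.deepcopy(query)
    context = updated.setdefault("context", {})
    context["useCache"] = use_cache
    context["populateCache"] = use_cache
    context["useResultLevelCache"] = use_result_level_cache
    context["populateResultLevelCache"] = use_result_level_cache
    return updated


# Hypothetical timeseries query against an example datasource.
base_query = {
    "queryType": "timeseries",
    "dataSource": "wikipedia",
    "intervals": ["2023-06-01/2023-06-08"],
    "granularity": "day",
    "aggregations": [{"type": "count", "name": "rows"}],
}

cached = with_cache_context(base_query)
uncached = with_cache_context(base_query, use_cache=False, use_result_level_cache=False)

# The JSON body would be POSTed to the Broker's native query endpoint
# (assumed here to be /druid/v2 on the Broker host).
print(json.dumps(uncached, indent=2))
```

Disabling both flags for a single query is a convenient way to measure how long the query takes without any help from the cache.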

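  2. Configure the cache

Druid's cache is configurable, and users can set its size, life cycle, and other parameters in each service's runtime.properties file according to their actual needs. The main configuration parameters are as follows:

(1) druid.cache.type: the cache implementation; the default is caffeine, an in-memory cache, and memcached can be used for a shared external cache;

(2) druid.cache.sizeInBytes: the maximum size of the cache in bytes; set it in the Broker's configuration to size the Broker's cache and in the Historical's configuration to size the data block cache;

(3) druid.broker.cache.resultLevelCacheLimit: the maximum size in bytes of a single query result that may be stored in the result cache;

(4) druid.cache.expireAfter: the time in milliseconds after the last access at which a cache entry may expire; by default there is no time limit.

The caches themselves are switched on with druid.broker.cache.useResultLevelCache and druid.broker.cache.populateResultLevelCache on the Broker, and with druid.historical.cache.useCache and druid.historical.cache.populateCache on the Historical nodes.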
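A configuration along these lines can be drafted with a small helper before it is copied into runtime.properties. Treat this as a sketch under assumptions: `render_properties` is a hypothetical helper, the sizes are arbitrary example values, and the property names are taken from Druid's configuration reference as I understand it, so verify them for your version.

```python
def render_properties(properties):
    """Format a dict of settings as runtime.properties lines, sorted by key."""
    lines = []
    for key in sorted(properties):
        value = properties[key]
        if isinstance(value, bool):
            value = "true" if value else "false"
        lines.append(f"{key}={value}")
    return "\n".join(lines)


MIB = 1024 * 1024

# Example values only; size the caches for your own workload.
broker_cache = {
    "druid.broker.cache.useResultLevelCache": True,
    "druid.broker.cache.populateResultLevelCache": True,
    "druid.cache.type": "caffeine",
    "druid.cache.sizeInBytes": 512 * MIB,
}
historical_cache = {
    "druid.historical.cache.useCache": True,
    "druid.historical.cache.populateCache": True,
    "druid.cache.type": "caffeine",
    "druid.cache.sizeInBytes": 256 * MIB,
}

print("# Broker")
print(render_properties(broker_cache))
print("# Historical")
print(render_properties(historical_cache))
```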

3. Cache optimization

The optimization of Druid cache mainly includes the following points:

  1. Cache clearing strategy

When the cache reaches its maximum capacity or certain other conditions are met, part of the cache must be cleared. By default, Druid removes expired entries, and evicts further entries when the cache is full, to free up space. In addition, users can supply their own cache implementation, with its own clearing strategy, by implementing the corresponding interfaces.
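The general idea of clearing a cache can be sketched as follows. This is a generic illustration, not the policy Druid's cache actually uses: the class `ExpiringLruCache` is hypothetical, and it assumes expired entries are dropped first and the least recently used entry is evicted when the cache is still full.

```python
from collections import OrderedDict


class ExpiringLruCache:
    """Drops expired entries first, then evicts the least recently used one."""

    def __init__(self, max_entries, ttl_seconds, clock):
        if max_entries < 1:
            raise ValueError("max_entries must be at least 1")
        self.max_entries = max_entries
        self.ttl_seconds = ttl_seconds
        self.clock = clock  # callable returning the current time in seconds
        self._entries = OrderedDict()  # key -> (stored_at, value), oldest use first

    def _purge_expired(self):
        now = self.clock()
        expired = [
            key
            for key, (stored_at, _) in self._entries.items()
            if now - stored_at >= self.ttl_seconds
        ]
        for key in expired:
            del self._entries[key]

    def get(self, key):
        self._purge_expired()
        if key not in self._entries:
            return None
        self._entries.move_to_end(key)  # mark as most recently used
        return self._entries[key][1]

    def put(self, key, value):
        self._purge_expired()
        self._entries.pop(key, None)
        while len(self._entries) >= self.max_entries:
            self._entries.popitem(last=False)  # evict least recently used
        self._entries[key] = (self.clock(), value)

    def __len__(self):
        return len(self._entries)
```

Passing the clock in as a parameter keeps the sketch easy to test; in real use `time.monotonic` would be a natural choice.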

  2. Set a reasonable cache size

The cache size directly determines how much the cache can hold and how effective it is. If the cache is too small, it cannot hold enough data blocks or query results, entries are evicted before they can be reused, and query performance suffers. If it is too large, it occupies memory that query processing needs, which also reduces query performance. The size therefore needs to be adjusted to the actual workload to achieve optimal performance.
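The effect of an undersized cache can be seen in a tiny simulation. The sketch assumes a plain LRU cache and a made-up workload that cycles over four keys; `lru_hit_rate` is a hypothetical helper, and real workloads and Druid's own eviction behaviour will differ.

```python
from collections import OrderedDict


def lru_hit_rate(accesses, capacity):
    """Fraction of accesses served from an LRU cache holding `capacity` keys."""
    cache = OrderedDict()
    hits = 0
    for key in accesses:
        if key in cache:
            hits += 1
            cache.move_to_end(key)  # mark as most recently used
        else:
            cache[key] = True
            if len(cache) > capacity:
                cache.popitem(last=False)  # evict least recently used
    return hits / len(accesses) if accesses else 0.0


# Four distinct queries repeated five times in the same order.
workload = [0, 1, 2, 3] * 5
for capacity in (2, 4):
    print(capacity, lru_hit_rate(workload, capacity))
# prints:
# 2 0.0
# 4 0.8
```

With room for only two entries, every entry is evicted just before it would have been reused, so the cache costs memory without ever producing a hit.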

  3. Set a reasonable cache life cycle

If the cache life cycle is too long, the memory occupied by stale entries is not released for a long time, which affects Druid query performance. If it is too short, entries expire before they can be reused, the cache hit rate drops, and query performance suffers as well. The cache life cycle therefore needs to be adjusted to the actual workload to achieve optimal performance.
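A back-of-the-envelope model shows how the life cycle interacts with how often a query repeats. The function `hits_with_ttl` is a hypothetical helper, and for simplicity it assumes an entry expires a fixed time after it was written, which is not necessarily how Druid counts expiry.

```python
def hits_with_ttl(request_times, ttl):
    """Count cache hits for one repeated query, given request times in seconds."""
    hits = 0
    cached_at = None
    for now in request_times:
        if cached_at is not None and now - cached_at < ttl:
            hits += 1  # entry is still alive
        else:
            cached_at = now  # miss: recompute and cache again
    return hits


# The same query arrives every 30 seconds.
request_times = [0, 30, 60, 90, 120]
print(hits_with_ttl(request_times, ttl=20))   # prints 0
print(hits_with_ttl(request_times, ttl=100))  # prints 3
```

A life cycle shorter than the interval between repeated queries yields no hits at all, so the interval at which queries recur is a useful lower bound when choosing the value.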

Summary:

Druid caching is an important way to optimize Druid query performance. Result caching and data block caching each have different advantages and disadvantages, and users need to choose the appropriate caching method based on specific scenarios. When using Druid cache, you need to pay attention to cache enablement and configuration, and adjust and optimize it according to actual scenarios.

source:php.cn