


[Nightingale Monitoring] A First Encounter with Nightingale, and It Still Holds Up!
Preface
Observability is a headache for most small and medium-sized companies, which mainly shows up in the following ways:
- Different open source tools have to be stitched together for different functions, such as SkyWalking for distributed tracing, ELK for log collection and analysis, and Grafana plus Prometheus for metrics monitoring.
- Each of these tools is an independent system, and historically they operated in isolation (though the Grafana stack has since brought its own pieces together).
- Data silos: traces, logs, and metrics all live separately, with no correlation between them. The solutions currently on the market are either commercial products or in-house builds.
The protagonist of this article, Nightingale (N9e), is not fully unified either. At the current stage it still relies on different open source components for different functions; N9e can present them in the same console, but the data itself is not yet correlated.
So why is N9e still worth studying?
Because it is moving in that direction.
As mentioned above, Grafana is already doing this: with the Grafana + Loki + Tempo + Prometheus combination, metrics, logs, and traces can be correlated. So what is the difference between N9e and Grafana?
In Mr. Qin's words: Grafana is better at managing dashboards, while N9e is better at managing alert rules.
N9e can route different alert rules to different business groups, avoiding a flood of alert messages in a single group that would otherwise turn into a cry-wolf situation over time.
With all that said, what does N9e actually look like?
The following is a system I have deployed.
As you can see, from this single console we can handle:
- Alert management
- Time series metric queries
- Log analysis
- Distributed tracing
- Alert self-healing
- User management
- ....
This way, you no longer need to switch back and forth between several applications, which saves a lot of time.
System Architecture
If you don't understand the architecture, you will get little out of using the system.
So let's take a look at what the N9e architecture looks like. Understanding how N9e works at the architectural level pays off for both deployment and maintenance.
N9e mainly offers a central converged deployment solution and an edge hybrid deployment solution, both explained below.
Central converged deployment solution
First, the architecture diagram:
In this solution, an N9e cluster is set up centrally and the monitoring data from other regions is sent to it, which requires a good network connection between the central cluster and those regions.
The central cluster mainly consists of the following components:
- MySQL: stores configuration information and alert events.
- Redis: stores JWT tokens, machine metadata, and other data.
- TSDB: the time series database that stores monitoring metrics.
- N9e: the core service, which handles web requests and provides the alert engine.
- LB: load balancing across multiple N9e instances.
For other Regions, you only need to deploy Categraf, which will push local monitoring data to the central cluster.
This architecture is simple and relatively cheap to maintain, provided the network links between data centers are reasonably good. If the network is poor, the following solution is needed instead.
Edge hybrid deployment solution
This architecture supplements the central deployment solution and mainly targets situations where the network is poor:
- The time series database (TSDB), the forwarding gateway, and the alert engine are moved down into the specific Region, which then handles its own data. The Region still keeps a heartbeat connection with the central cluster, and users can still view every Region's monitoring data from the central cluster's dashboards.
- If you already have Prometheus, you can also connect it directly as a data source.
When deploying the time series database, alert engine, and forwarding gateway in an edge data center, note that the alert engine still depends on the central database (it needs to sync alert rules), and so does the forwarding gateway (it registers objects in the database), so the relevant network access has to be opened.
PS: in this solution the network is already poor, yet these connections still have to be opened, so it may still be affected by network problems.
Single-machine deployment
Why choose single-machine deployment here?
Because I want to deploy each component one by one, which helps in understanding how the whole N9e stack fits together.
Tips: I am using Ubuntu 22.04.1.
Install MySQL
Tips: for the sake of speed, I installed MariaDB instead of MySQL.
# Update package sources
$ sudo apt-get update
# Upgrade installed packages
$ sudo apt-get upgrade
# Install MariaDB
$ sudo apt-get install mariadb-server-10.6
It will start automatically after the installation completes. Then set a password for the database user:
# Connect to the database
$ sudo mysql
# Grant privileges and set the password
> GRANT ALL PRIVILEGES ON *.* TO 'root'@'localhost' IDENTIFIED BY '1234';
> flush privileges;
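To make sure the new credentials actually work, a quick sanity check (nothing N9e-specific; note that on Ubuntu MariaDB may keep unix_socket auth for root, in which case you may still need sudo):
# Verify that we can log in with the password we just set
$ mysql -uroot -p1234 -e "SELECT VERSION();"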
Install Redis
# Update package sources
$ sudo apt-get update
# Upgrade installed packages
$ sudo apt-get upgrade
# Install Redis
$ sudo apt install redis-server
It will start automatically by default.
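A quick check that Redis is up and answering (redis-cli is pulled in as a dependency of redis-server on Ubuntu):
# Redis should answer PONG if the server is running
$ redis-cli ping
PONG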
Install TSDB
There are many TSDB options for N9e:
- Prometheus
- M3DB
- VictoriaMetrics
- InfluxDB
- Thanos
Here I use VictoriaMetrics.
# Download the binary package
$ wget https://github.com/VictoriaMetrics/VictoriaMetrics/releases/download/v1.90.0/victoria-metrics-linux-amd64-v1.90.0.tar.gz
# Extract
$ tar xf victoria-metrics-linux-amd64-v1.90.0.tar.gz
# Start
$ nohup ./victoria-metrics-prod &>victoria.log &
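Single-node VictoriaMetrics listens on port 8428 by default. To confirm it started correctly, check the port and, if you like, its simple health endpoint (available in recent releases):
# Confirm the listener
$ ss -ntl | grep 8428
# The single-node binary exposes a basic health check
$ curl -s http://127.0.0.1:8428/health
OK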
Install N9e
# Download the latest binary package
$ wget https://github.com/ccfos/nightingale/releases/download/v6.0.0-ga.3/n9e-v6.0.0-ga.3-linux-amd64.tar.gz
# Extract
$ mkdir n9e
$ tar xf n9e-v6.0.0-ga.3-linux-amd64.tar.gz -C n9e/
# The directory should look like this
$ ll
total 35332
drwxrwxr-x  7 jokerbai jokerbai     4096 Apr 12 14:05 ./
drwxr-xr-x  4 jokerbai jokerbai     4096 Apr 12 14:05 ../
drwxrwxr-x  3 jokerbai jokerbai     4096 Apr 12 14:05 cli/
drwxrwxr-x 10 jokerbai jokerbai     4096 Apr 12 14:05 docker/
drwxrwxr-x  4 jokerbai jokerbai     4096 Apr 12 14:09 etc/
drwxrwxr-x 20 jokerbai jokerbai     4096 Apr 12 14:05 integrations/
-rwxr-xr-x  1 jokerbai jokerbai 25280512 Apr  6 19:05 n9e*
-rwxr-xr-x  1 jokerbai jokerbai 10838016 Apr  6 19:05 n9e-cli*
-rw-r--r--  1 jokerbai jokerbai    29784 Apr  6 19:04 n9e.sql
drwxrwxr-x  6 jokerbai jokerbai     4096 Apr 12 14:05 pub/
Then import the N9e database schema:
# Import the database
$ mysql -uroot -p < n9e.sql
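As a quick sanity check that the import worked, list the databases; n9e.sql should have created one for Nightingale (it was n9e_v6 in the copy I downloaded, but check the CREATE DATABASE statement in your n9e.sql to be sure):
$ mysql -uroot -p1234 -e "SHOW DATABASES;"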
Next, modify the Pushgw writer in the N9e configuration (under the etc/ directory we just extracted) so that it points at VictoriaMetrics:
[[Pushgw.Writers]]
# Url = "http://127.0.0.1:8480/insert/0/prometheus/api/v1/write"
Url = "http://127.0.0.1:8428/api/v1/write"
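The same configuration also holds the MySQL and Redis connection settings, which must match what we set up above. The snippet below is only a sketch of what those sections looked like in my copy of the v6 config; the key names and the database name (n9e_v6) are assumptions, so treat the shipped file under etc/ as the source of truth:
[DB]
# MySQL DSN -- user root with the '1234' password set above (key name per my v6 copy; verify in your own config)
DSN = "root:1234@tcp(127.0.0.1:3306)/n9e_v6?charset=utf8mb4&parseTime=True&loc=Local"

[Redis]
# The local Redis installed above (assumed defaults)
Address = "127.0.0.1:6379"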
# Start the service
$ nohup ./n9e &>n9e.log &
# Check that port 17000 is listening
$ ss -ntl | grep 17000
LISTEN 0      4096              *:17000           *:*
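Running everything with nohup is fine for a quick test, but for anything longer-lived you probably want a service manager. A minimal systemd unit might look like the following; the install path /opt/n9e is an assumption, so adjust it to wherever you extracted the package:
# /etc/systemd/system/n9e.service -- minimal sketch, paths are assumptions
[Unit]
Description=Nightingale (n9e)
After=network-online.target mariadb.service redis-server.service

[Service]
Type=simple
WorkingDirectory=/opt/n9e
ExecStart=/opt/n9e/n9e
Restart=on-failure
RestartSec=5

[Install]
WantedBy=multi-user.target
After that, sudo systemctl daemon-reload && sudo systemctl enable --now n9e replaces the nohup invocation.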
Open http://127.0.0.1:17000 in your browser, then log in with the username root and the password root.2020.
Install Categraf
Categraf is a monitoring and collection agent that pushes the collected data to the TSDB.
# Download
$ wget https://download.flashcat.cloud/categraf-v0.2.38-linux-amd64.tar.gz
# Extract
$ tar xf categraf-v0.2.38-linux-amd64.tar.gz
# Enter the directory
$ cd categraf-v0.2.38-linux-amd64/
Edit the main categraf configuration so that its writer points at N9e and the heartbeat is enabled:
[[writers]]
url = "http://127.0.0.1:17000/prometheus/v1/write"

[heartbeat]
enable = true
Then start it:
$ nohup ./categraf &>categraf.log &
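To verify that data is actually flowing, you can query VictoriaMetrics directly through its Prometheus-compatible query API. In my setup categraf's CPU input reports a metric named cpu_usage_active; if your inputs differ, substitute any metric you expect to be collected:
# Query VictoriaMetrics via the Prometheus-compatible HTTP API
$ curl -s 'http://127.0.0.1:8428/api/v1/query?query=cpu_usage_active'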
Then you can see the basic information on the main interface.
Add data source
At this point, if you try to query time series metrics, nothing comes back, because no data source has been added yet.
Add a data source in System Configuration->Data Source, as follows:
Summary
This article gives a first impression of Nightingale: it briefly introduces the overall architecture and then walks through an installation from 0 to 1, so that you have a clear picture of the components that make up Nightingale.
At present, Nightingale has been updated to V6, which introduces many new capabilities, such as integration with ELK and Jaeger. This series will continue to be updated in future posts.
