


Designed specifically for decision trees: National University of Singapore and Tsinghua University jointly propose a fast, secure federated learning system
Federated learning is an active area of machine learning in which multiple parties jointly train a model without transferring their data. As the field has developed, federated learning systems have emerged one after another, including FATE, FedML, PaddleFL and TensorFlow-Federated. However, most of these systems do not support federated training of tree models. Compared with neural networks, tree models train quickly, are highly interpretable, and are well suited to tabular data. They are widely used in finance, medical care, the Internet and other fields, for tasks such as advertising recommendation and stock prediction.
The representative decision tree model is the Gradient Boosting Decision Tree (GBDT). Because the predictive power of a single tree is limited, GBDT trains multiple trees in sequence through boosting: each new tree is fitted to the residual between the label and the current prediction, and the combined trees achieve good predictive performance. Representative GBDT systems include XGBoost, LightGBM, CatBoost and ThunderGBM. Among them, XGBoost has been used by KDD Cup championship teams many times. However, none of these systems supports GBDT training in federated learning scenarios. Recently, researchers from the National University of Singapore and Tsinghua University proposed FedTree, a new federated learning system that focuses on training tree models.
- Paper address: https://github.com/Xtra-Computing/FedTree/blob/main/FedTree_draft_paper.pdf
- Project address: https://github.com/Xtra-Computing/FedTree
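The boosting idea described above (each tree fits the residual between the labels and the current prediction) can be illustrated with a minimal sketch. It assumes a single numeric feature, squared-error loss and depth-1 trees (stumps); the function names, toy data and learning rate are illustrative choices, not FedTree's or XGBoost's implementation.

```python
from statistics import mean

def fit_stump(xs, residuals):
    """Pick the threshold on a 1-D feature that minimizes squared error."""
    best = None
    # Candidate thresholds exclude the largest value so both sides are non-empty.
    for threshold in sorted(set(xs))[:-1]:
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        left_value, right_value = mean(left), mean(right)
        error = (sum((r - left_value) ** 2 for r in left)
                 + sum((r - right_value) ** 2 for r in right))
        if best is None or error < best[0]:
            best = (error, threshold, left_value, right_value)
    _, threshold, left_value, right_value = best
    return threshold, left_value, right_value

def predict_stump(stump, x):
    threshold, left_value, right_value = stump
    return left_value if x <= threshold else right_value

def fit_gbdt(xs, ys, n_trees=20, learning_rate=0.5):
    """Train stumps in sequence; each one fits the current residuals."""
    base = mean(ys)                      # initial prediction: the label mean
    predictions = [base] * len(xs)
    stumps = []
    for _ in range(n_trees):
        residuals = [y - p for y, p in zip(ys, predictions)]
        stump = fit_stump(xs, residuals)
        stumps.append(stump)
        predictions = [p + learning_rate * predict_stump(stump, x)
                       for p, x in zip(predictions, xs)]
    return base, stumps, learning_rate

def predict_gbdt(model, x):
    base, stumps, learning_rate = model
    return base + learning_rate * sum(predict_stump(s, x) for s in stumps)

def mse(model, xs, ys):
    return mean((y - predict_gbdt(model, x)) ** 2 for x, y in zip(xs, ys))

# Toy data (made up for illustration): y is roughly equal to x.
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
ys = [1.2, 1.9, 3.1, 3.9, 5.2, 5.8, 7.1, 8.0]
one_tree = fit_gbdt(xs, ys, n_trees=1)
many_trees = fit_gbdt(xs, ys, n_trees=20)
```

Comparing `mse(one_tree, xs, ys)` with `mse(many_trees, xs, ys)` shows the training error shrinking as more trees are added, which is why a single weak tree is not used on its own.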
FedTree system introduction
The FedTree architecture is shown in Figure 1. It consists of five modules: interface, environment, framework, privacy protection and model.
Figure 1: FedTree system architecture diagram
Interface: FedTree supports two interfaces, a command line interface and a Python interface. Users only need to provide parameters (number of participants, federation scenario, etc.) and can then run FedTree training with a one-line command. FedTree's Python interface is compatible with scikit-learn, so training and prediction are done by calling fit() and predict().
Environment: FedTree supports simulated federated learning on a single machine and distributed federated learning across multiple machines. In the single-machine environment, FedTree divides the data into multiple subsets, and each subset is trained as one participant. In the multi-machine environment, each machine acts as a participant, and machines communicate through gRPC. In addition to CPUs, FedTree supports GPUs to accelerate training.
Framework: FedTree supports GBDT training in horizontal and vertical federated learning scenarios. In the horizontal scenario, participants hold different training samples but share the same feature space. In the vertical scenario, participants hold the same training samples but different feature spaces. To preserve model performance, multiple parties participate in the training of each tree node in both scenarios. In addition, FedTree supports ensemble learning, where participants train trees in parallel and then aggregate them to reduce communication overhead between participants.
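The difference between the two scenarios can be made concrete with a small sketch of how one table would be divided among participants. The helper names and the round-robin assignment are assumptions made for illustration and do not reflect how FedTree itself partitions data.

```python
def horizontal_split(rows, n_parties):
    """Horizontal: each party holds different samples with all feature columns."""
    return [rows[i::n_parties] for i in range(n_parties)]

def vertical_split(rows, n_parties):
    """Vertical: each party holds every sample but only some feature columns."""
    n_features = len(rows[0])
    column_groups = [list(range(i, n_features, n_parties))
                     for i in range(n_parties)]
    return [[[row[c] for c in cols] for row in rows] for cols in column_groups]

# Toy table: 6 samples (rows) and 4 features (columns).
rows = [[10 * r + c for c in range(4)] for r in range(6)]

horizontal = horizontal_split(rows, 2)   # 2 parties, 3 samples x 4 features each
vertical = vertical_split(rows, 2)       # 2 parties, 6 samples x 2 features each
```

In the horizontal case the parties' row sets are disjoint, while in the vertical case the rows line up across parties and only the columns differ.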
Privacy: Because the gradients passed during training may leak information about the training data, FedTree provides privacy-preserving methods to further protect gradient information, including homomorphic encryption (HE) and secure aggregation (SA). FedTree also provides differential privacy to protect the final trained model.
Model: Building on single-tree training, FedTree supports GBDT and random forests through boosting and bagging. By setting different loss functions, models trained by FedTree support a variety of tasks, including classification and regression.
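To give a sense of how secure aggregation can hide individual gradients while still revealing their sum, here is a minimal sketch of pairwise additive masking, one common construction. The article does not specify which SA protocol FedTree uses, so this scheme, the modulus, and the use of a single seeded random generator in place of pairwise shared secrets are all assumptions made for illustration.

```python
import random

MODULUS = 2 ** 32  # all arithmetic is done modulo this value (assumed)

def mask_updates(updates, seed=0):
    """Add a random mask for each pair (i, j): party i adds it, party j subtracts it."""
    rng = random.Random(seed)  # stand-in for secrets shared between each pair
    n = len(updates)
    masked = list(updates)
    for i in range(n):
        for j in range(i + 1, n):
            mask = rng.randrange(MODULUS)
            masked[i] = (masked[i] + mask) % MODULUS
            masked[j] = (masked[j] - mask) % MODULUS
    return masked

def aggregate(masked):
    """The server only sums masked values; the pairwise masks cancel out."""
    return sum(masked) % MODULUS

gradients = [12, 7, 30]            # integer-encoded gradient sums, one per party
masked = mask_updates(gradients)   # what each party would send
total = aggregate(masked)          # equals sum(gradients) modulo MODULUS
```

Each masked value on its own looks random to the server, but because every mask is added once and subtracted once, the aggregate matches the sum of the original values.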
Experiment
Table 1 summarizes the AUC of different systems on a9a, breast and credit, and the RMSE on abalone. FedTree's model quality is almost identical to that of GBDT trained on all the data (XGBoost, ThunderGBM) and to SecureBoost (SBT) in FATE. Moreover, the privacy protection techniques SA and HE do not affect model performance.
Table 1: Comparison of model effects of different systems
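For readers unfamiliar with the two metrics used in Table 1, the sketch below computes them from their standard definitions: RMSE for regression, and AUC as the fraction of positive/negative pairs ranked correctly. The function names and example values are illustrative and are not taken from the FedTree experiments.

```python
from math import sqrt

def rmse(y_true, y_pred):
    """Root mean squared error: lower is better."""
    return sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

def auc(labels, scores):
    """Probability that a positive sample is scored above a negative one.

    Ties count as half a correct ranking. Higher is better.
    """
    positives = [s for l, s in zip(labels, scores) if l == 1]
    negatives = [s for l, s in zip(labels, scores) if l == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in positives for n in negatives)
    return wins / (len(positives) * len(negatives))

# Three of the four positive/negative pairs are ranked correctly here.
example_auc = auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
example_rmse = rmse([1.0, 2.0, 3.0], [1.0, 2.0, 5.0])
```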
Table 2 summarizes the per-tree training time (in seconds) of different systems. FedTree is much faster than FATE and achieves a speedup of more than 100 times in the horizontal federated learning scenario.
Table 2: Comparison of training time for each tree in different systems
For more research details, please refer to the original FedTree paper.