Table of Contents
Question content
Solution
Home Backend Development Golang Convert dot-separated values ​​to Go struct using Python

Convert dot-separated values ​​to Go struct using Python

Feb 10, 2024 pm 01:33 PM
go language

使用 Python 将点分隔值转换为 Go 结构

php editor Youzi will introduce in this article how to use Python to convert dot-separated values ​​(such as "key1.subkey1.subkey2") into a structure in the Go language. This transformation process is useful for extracting and processing data from configuration files or API responses. We will use Python's recursive functions and Go language structures to implement this conversion, and give detailed code examples and explanations. After studying this article, readers will be able to easily process and convert point-separated values, improving the efficiency and flexibility of data processing.

Question content

This is a specific requirement for an application that can change the configuration (specifically the wso2 identity server, since I'm using go to write the kubernetes operator for it). But it's really not relevant here. I want to create a solution that allows easy management of large number of configuration maps to generate go structures. These configurations are mapped in .csv

Link to .csv - my_configs.csv

I want, Write a python script that automatically generates go structures so that any changes to the application configuration can be updated by simply executing the python script to create the corresponding go structure. I'm referring to the configuration of the application itself. For example, toml key names in csv can be changed/new values ​​can be added.

So far I have successfully created a python script that almost achieves my goal. The script is,

import pandas as pd

def convert_to_dict(data):
    result = {}
    for row in data:
        current_dict = result
        for item in row[:-1]:
            if item is not none:
                if item not in current_dict:
                    current_dict[item] = {}
                current_dict = current_dict[item]
    return result

def extract_json_key(yaml_key):
    if isinstance(yaml_key, str) and '.' in yaml_key:
        return yaml_key.split('.')[-1]
    else:
        return yaml_key

def add_fields_to_struct(struct_string,go_var,go_type,json_key,toml_key):
    struct_string += str(go_var) + " " + str(go_type) + ' `json:"' + str(json_key) + ',omitempty" toml:"' +str(toml_key) + '"` ' + "\n"
    return struct_string

def generate_go_struct(struct_name, struct_data):
    struct_name="configurations" if struct_name == "" else struct_name
    struct_string = "type " + struct_name + " struct {\n"
    yaml_key=df['yaml_key'].str.split('.').str[-1]
    
    # base case: generate fields for the current struct level    
    for key, value in struct_data.items():
        selected_rows = df[yaml_key == key]

        if len(selected_rows) > 1:
            go_var = selected_rows['go_var'].values[1]
            toml_key = selected_rows['toml_key'].values[1]
            go_type=selected_rows['go_type'].values[1]
            json_key=selected_rows['json_key'].values[1]
        else:
            go_var = selected_rows['go_var'].values[0]
            toml_key = selected_rows['toml_key'].values[0]
            go_type=selected_rows['go_type'].values[0]
            json_key=selected_rows['json_key'].values[0]

        # add fields to the body of the struct
        struct_string=add_fields_to_struct(struct_string,go_var,go_type,json_key,toml_key)   

    struct_string += "}\n\n"
    
    # recursive case: generate struct definitions for nested structs
    for key, value in struct_data.items():
        selected_rows = df[yaml_key == key]

        if len(selected_rows) > 1:
            go_var = selected_rows['go_var'].values[1]
        else:
            go_var = selected_rows['go_var'].values[0]

        if isinstance(value, dict) and any(isinstance(v, dict) for v in value.values()):
            nested_struct_name = go_var
            nested_struct_data = value
            struct_string += generate_go_struct(nested_struct_name, nested_struct_data)
    
    return struct_string

# read excel
csv_file = "~/downloads/my_configs.csv"
df = pd.read_csv(csv_file)

# remove rows where all columns are nan
df = df.dropna(how='all')
# create the 'json_key' column using the custom function
df['json_key'] = df['yaml_key'].apply(extract_json_key)

data=df['yaml_key'].values.tolist() # read the 'yaml_key' column
data = pd.dataframe({'column':data}) # convert to dataframe

data=data['column'].str.split('.', expand=true) # split by '.'

nested_list = data.values.tolist() # convert to nested list
data=nested_list 

result_json = convert_to_dict(data) # convert to dict (json)

# the generated co code
go_struct = generate_go_struct("", result_json)

# write to file
file_path = "output.go"
with open(file_path, "w") as file:
    file.write(go_struct)
Copy after login

The problem is (see below part of csv),

authentication.authenticator.basic
authentication.authenticator.basic.parameters
authentication.authenticator.basic.parameters.showAuthFailureReason
authentication.authenticator.basic.parameters.showAuthFailureReasonOnLoginPage
authentication.authenticator.totp
authentication.authenticator.totp.parameters
authentication.authenticator.totp.parameters.showAuthFailureReason
authentication.authenticator.totp.parameters.showAuthFailureReasonOnLoginPage
authentication.authenticator.totp.parameters.encodingMethod
authentication.authenticator.totp.parameters.timeStepSize
Copy after login

Here, since the basic and totp fields parameters are duplicated, the script confuses itself and generates two totpparameters structures. The expected result is to have basicparameters and totpparameters structures. There are many similar duplicate words in the yaml_key column of the csv.

I know this has to do with the index being hardcoded to 1 in go_var = selected_rows['go_var'].values[1], but it's hard to fix this.

Can someone give me an answer? I think,

  1. Problems with recursive functions
  2. There is a problem with the code that generates json may be the root cause of this issue.

Thanks!

I also tried using chatgpt, but since this has to do with nesting and recursion, the answer provided by chatgpt is not very efficient.

renew

I found a problem with the row containing the properties, pooloptions, endpoint and parameters fields. This is because they are duplicated in the yaml_key column.

Solution

I was able to resolve this issue. However, I had to use a completely new approach to the problem, which was to use a tree data structure and then iterate over it. This is the main logic behind it - https://www.geeksforgeeks.org/level-sequential tree traversal/

This is the working python code.

import pandas as pd
from collections import deque

structs=[]
class TreeNode:
    def __init__(self, name):
        self.name = name
        self.children = []
        self.path=""

    def add_child(self, child):
        self.children.append(child)

def create_tree(data):
    root = TreeNode('')
    for item in data:
        node = root
        for name in item.split('.'):
            existing_child = next((child for child in node.children if child.name == name), None)
            if existing_child:
                node = existing_child
            else:
                new_child = TreeNode(name)
                node.add_child(new_child)
                node = new_child
    return root

def generate_go_struct(struct_data):
    struct_name = struct_data['struct_name']
    fields = struct_data['fields']
    
    go_struct = f"type {struct_name} struct {{\n"

    for field in fields:
        field_name = field['name']
        field_type = field['type']
        field_default_val = str(field['default_val'])
        json_key=field['json_key']
        toml_key=field['toml_key']

        tail_part=f"\t{field_name} {field_type} `json:\"{json_key},omitempty\" toml:\"{toml_key}\"`\n\n"

        if pd.isna(field['default_val']):
            go_struct += tail_part
        else:
            field_default_val = "\t// +kubebuilder:default:=" + field_default_val
            go_struct += field_default_val + "\n" + tail_part

    go_struct += "}\n\n"
    return go_struct

def write_go_file(go_structs, file_path):
    with open(file_path, 'w') as file:
        for go_struct in go_structs:
            file.write(go_struct)

def create_new_struct(struct_name):
    struct_name = "Configurations" if struct_name == "" else struct_name
    struct_dict = {
        "struct_name": struct_name,
        "fields": []
    }
    
    return struct_dict

def add_field(struct_dict, field_name, field_type,default_val,json_key, toml_key):
    field_dict = {
        "name": field_name,
        "type": field_type,
        "default_val": default_val,
        "json_key":json_key,
        "toml_key":toml_key
    }
    struct_dict["fields"].append(field_dict)
    
    return struct_dict

def traverse_tree(root):
    queue = deque([root])  
    while queue:
        node = queue.popleft()
        filtered_df = df[df['yaml_key'] == node.path]
        go_var = filtered_df['go_var'].values[0] if not filtered_df.empty else None
        go_type = filtered_df['go_type'].values[0] if not filtered_df.empty else None

        if node.path=="":
            go_type="Configurations"

        # The structs themselves
        current_struct = create_new_struct(go_type)
        
        for child in node.children:  
            if (node.name!=""):
                child.path=node.path+"."+child.name   
            else:
                child.path=child.name

            filtered_df = df[df['yaml_key'] == child.path]
            go_var = filtered_df['go_var'].values[0] if not filtered_df.empty else None
            go_type = filtered_df['go_type'].values[0] if not filtered_df.empty else None
            default_val = filtered_df['default_val'].values[0] if not filtered_df.empty else None

            # Struct fields
            json_key = filtered_df['yaml_key'].values[0].split('.')[-1] if not filtered_df.empty else None
            toml_key = filtered_df['toml_key'].values[0].split('.')[-1] if not filtered_df.empty else None
            
            current_struct = add_field(current_struct, go_var, go_type,default_val,json_key, toml_key)

            if (child.children):
                # Add each child to the queue for processing
                queue.append(child)

        go_struct = generate_go_struct(current_struct)
        # print(go_struct,"\n")        
        structs.append(go_struct)

    write_go_file(structs, "output.go")

csv_file = "~/Downloads/my_configs.csv"
df = pd.read_csv(csv_file) 

sample_data=df['yaml_key'].values.tolist()

# Create the tree
tree = create_tree(sample_data)

# Traverse the tree
traverse_tree(tree)

Copy after login

Thank you for your help!

The above is the detailed content of Convert dot-separated values ​​to Go struct using Python. For more information, please follow other related articles on the PHP Chinese website!

Statement of this Website
The content of this article is voluntarily contributed by netizens, and the copyright belongs to the original author. This site does not assume corresponding legal responsibility. If you find any content suspected of plagiarism or infringement, please contact admin@php.cn

Hot AI Tools

Undresser.AI Undress

Undresser.AI Undress

AI-powered app for creating realistic nude photos

AI Clothes Remover

AI Clothes Remover

Online AI tool for removing clothes from photos.

Undress AI Tool

Undress AI Tool

Undress images for free

Clothoff.io

Clothoff.io

AI clothes remover

Video Face Swap

Video Face Swap

Swap faces in any video effortlessly with our completely free AI face swap tool!

Hot Tools

Notepad++7.3.1

Notepad++7.3.1

Easy-to-use and free code editor

SublimeText3 Chinese version

SublimeText3 Chinese version

Chinese version, very easy to use

Zend Studio 13.0.1

Zend Studio 13.0.1

Powerful PHP integrated development environment

Dreamweaver CS6

Dreamweaver CS6

Visual web development tools

SublimeText3 Mac version

SublimeText3 Mac version

God-level code editing software (SublimeText3)

What is the problem with Queue thread in Go's crawler Colly? What is the problem with Queue thread in Go's crawler Colly? Apr 02, 2025 pm 02:09 PM

Queue threading problem in Go crawler Colly explores the problem of using the Colly crawler library in Go language, developers often encounter problems with threads and request queues. �...

What libraries are used for floating point number operations in Go? What libraries are used for floating point number operations in Go? Apr 02, 2025 pm 02:06 PM

The library used for floating-point number operation in Go language introduces how to ensure the accuracy is...

How to solve the user_id type conversion problem when using Redis Stream to implement message queues in Go language? How to solve the user_id type conversion problem when using Redis Stream to implement message queues in Go language? Apr 02, 2025 pm 04:54 PM

The problem of using RedisStream to implement message queues in Go language is using Go language and Redis...

In Go, why does printing strings with Println and string() functions have different effects? In Go, why does printing strings with Println and string() functions have different effects? Apr 02, 2025 pm 02:03 PM

The difference between string printing in Go language: The difference in the effect of using Println and string() functions is in Go...

What should I do if the custom structure labels in GoLand are not displayed? What should I do if the custom structure labels in GoLand are not displayed? Apr 02, 2025 pm 05:09 PM

What should I do if the custom structure labels in GoLand are not displayed? When using GoLand for Go language development, many developers will encounter custom structure tags...

What is the difference between `var` and `type` keyword definition structure in Go language? What is the difference between `var` and `type` keyword definition structure in Go language? Apr 02, 2025 pm 12:57 PM

Two ways to define structures in Go language: the difference between var and type keywords. When defining structures, Go language often sees two different ways of writing: First...

Which libraries in Go are developed by large companies or provided by well-known open source projects? Which libraries in Go are developed by large companies or provided by well-known open source projects? Apr 02, 2025 pm 04:12 PM

Which libraries in Go are developed by large companies or well-known open source projects? When programming in Go, developers often encounter some common needs, ...

Why is it necessary to pass pointers when using Go and viper libraries? Why is it necessary to pass pointers when using Go and viper libraries? Apr 02, 2025 pm 04:00 PM

Go pointer syntax and addressing problems in the use of viper library When programming in Go language, it is crucial to understand the syntax and usage of pointers, especially in...

See all articles