Access IRIS database with ODBC or JDBC using Python

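String problems

I am using Python to access an IRIS database over JDBC (or ODBC). I want to pull the data into a pandas DataFrame so I can manipulate it and build charts from it. I ran into a string-handling problem when using JDBC. This post is meant to help anyone else who hits the same issue. Or, if there is a simpler way to solve it, let me know in the comments!

I am on OSX, so I am not sure how unique my problem is. I am using Jupyter Notebooks, although the code is generally the same in any other Python program or framework.

The JDBC problem

When I fetch data from the database, the column descriptions and any string data are returned as the data type java.lang.String. If you print the string data, it looks like "(p,a,i,n,i,n,t,h,e,r,e,a,r)" instead of the expected "painintherear".

This is probably because, when fetched over JDBC, values of type java.lang.String arrive as an iterable or array. This can happen if the Python-Java bridge you are using (for example JayDeBeApi over JDBC) does not automatically convert java.lang.String to a Python str in one step.

By contrast, a Python str represents the whole string as a single unit. When Python retrieves an ordinary str (for example over ODBC), it is not split into individual characters.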
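
The symptom is easy to reproduce in plain Python, without a database. This is only an illustrative sketch: the variable names are made up, and it does not claim to show what the bridge does internally. It shows that treating a string as an iterable of characters gives the comma-separated form described above, while handling it as one str keeps it intact.

```python
# Illustration only: what happens when a string is handled character by character
value = "painintherear"

# Iterating over a string yields one character at a time
chars = tuple(value)

# Joining those characters with commas reproduces the garbled form
garbled = "(" + ",".join(chars) + ")"
print(garbled)

# Handling the value as a single str keeps it as one unit
intact = str(value)
print(intact)
```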

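The JDBC solution

To fix this, you must make sure each java.lang.String is converted to Python's str type. You can handle the conversion explicitly while processing the fetched data, so the value is not interpreted as an iterable or a list of characters.

There are many ways to do this string manipulation; this is what I did.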

import pandas as pd

import pyodbc

import jaydebeapi
import jpype

def my_function(jdbc_used):

    # Some other code to create the connection goes here

    cursor.execute(query_string)

    if jdbc_used:
        # Fetch the results and convert any java.lang.String values to Python str
        # (a java.lang.String prints as "(p,a,i,n,i,n,t,h,e,r,e,a,r)"; str() gives "painintherear")
        results = []
        for row in cursor.fetchall():
            converted_row = [str(item) if isinstance(item, jpype.java.lang.String) else item for item in row]
            results.append(converted_row)

        # Get the column names and ensure they are Python strings 
        column_names = [str(col[0]) for col in cursor.description]

        # Create the dataframe
        df = pd.DataFrame.from_records(results, columns=column_names)

        # Check the results
        print(df.head().to_string())

    else:
        # I was also testing ODBC
        # For very large result sets, fetch in chunks with cursor.fetchmany() instead of fetchall()
        results = cursor.fetchall()
        # Get the column names
        column_names = [column[0] for column in cursor.description]
        # Create the dataframe
        df = pd.DataFrame.from_records(results, columns=column_names)

    # Do stuff with your dataframe
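
The ODBC branch above mentions cursor.fetchmany() for very large result sets. Here is a minimal sketch of that idea, assuming a standard DB-API cursor. The helper name iter_rows and the chunk size are my own choices, and sqlite3 is used only as a stand-in so the example runs without an IRIS connection.

```python
import sqlite3


def iter_rows(cursor, chunk_size=1000):
    """Yield rows from a DB-API cursor, fetching chunk_size rows at a time."""
    while True:
        chunk = cursor.fetchmany(chunk_size)
        if not chunk:
            # An empty list means the result set is exhausted
            break
        for row in chunk:
            yield row


# Stand-in database (not IRIS) so the sketch is runnable anywhere
demo_conn = sqlite3.connect(":memory:")
demo_conn.execute("CREATE TABLE t (n INTEGER)")
demo_conn.executemany("INSERT INTO t VALUES (?)", [(i,) for i in range(2500)])

demo_cursor = demo_conn.cursor()
demo_cursor.execute("SELECT n FROM t ORDER BY n")

# Rows arrive in three fetches (1000, 1000, 500) but are consumed as one stream
rows = list(iter_rows(demo_cursor, chunk_size=1000))
print(len(rows))

demo_conn.close()
```

With a real cursor you could process each row as it arrives, or collect the rows in batches, rather than holding the whole result set from fetchall() in memory at once.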

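The ODBC problem

With an ODBC connection, string values are not returned, or come back missing.

If you are connecting to a database that contains Unicode data (for example, names in different languages), or your application needs to store or retrieve non-ASCII characters, you must make sure the data stays correctly encoded as it passes between the database and your Python application.

The ODBC solution

This code ensures that string data is encoded and decoded as UTF-8 when it is sent to and retrieved from the database. This matters especially when handling non-ASCII characters or ensuring compatibility with Unicode data.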

def create_connection(connection_string, password):
    connection = None

    try:
        # print(f"Connecting to {connection_string}")
        connection = pyodbc.connect(connection_string + ";PWD=" + password)

        # Ensure strings are read correctly
        connection.setdecoding(pyodbc.SQL_CHAR, encoding="utf8")
        connection.setdecoding(pyodbc.SQL_WCHAR, encoding="utf8")
        connection.setencoding(encoding="utf8")

    except pyodbc.Error as e:
        print(f"The error '{e}' occurred")

    return connection

connection.setdecoding(pyodbc.SQL_CHAR,encoding="utf8")

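Tells pyodbc how to decode character data from the database when fetching SQL_CHAR types (typically single-byte character fields such as CHAR or VARCHAR).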

connection.setdecoding(pyodbc.SQL_WCHAR,encoding="utf8")

Sets the decoding for SQL_WCHAR, the wide-character types (that is, Unicode strings, such as NVARCHAR or NCHAR in SQL Server).

connection.setencoding(encoding="utf8")

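Ensures that any string or character data sent from Python to the database is encoded as UTF-8, which means Python converts its internal str type (which is Unicode) into UTF-8 bytes when communicating with the database.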
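
To make the encode and decode steps concrete, here is a small plain-Python sketch that needs no database. The sample text is an arbitrary choice of mine. It shows a str becoming UTF-8 bytes and back, and what happens when the bytes are decoded with the wrong encoding.

```python
# Arbitrary sample text containing non-ASCII characters
text = "Zo\u00eb \u540d\u524d"

# What setencoding(encoding="utf8") asks for: str -> UTF-8 bytes
encoded = text.encode("utf-8")

# What setdecoding(..., encoding="utf8") asks for: UTF-8 bytes -> str
decoded = encoded.decode("utf-8")

# Decoding the same bytes with a different encoding does not fail here,
# but it produces different (garbled) text
wrong = encoded.decode("latin-1")

print(decoded == text)
print(wrong == text)
```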

Putting it all together

Install JDBC

Install Java using the dmg

https://www.oracle.com/middleeast/java/technologies/downloads/#jdk23-mac

Update the shell to set the default version

$ /usr/libexec/java_home -V
Matching Java Virtual Machines (2):
    23 (arm64) "Oracle Corporation" - "Java SE 23" /Library/Java/JavaVirtualMachines/jdk-23.jdk/Contents/Home
    1.8.421.09 (arm64) "Oracle Corporation" - "Java" /Library/Internet Plug-Ins/JavaAppletPlugin.plugin/Contents/Home
/Library/Java/JavaVirtualMachines/jdk-23.jdk/Contents/Home
$ echo $SHELL
/opt/homebrew/bin/bash
$ vi ~/.bash_profile

Add JAVA_HOME to your path

export JAVA_HOME=$(/usr/libexec/java_home -v 23)
export PATH=$JAVA_HOME/bin:$PATH
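
Since the Python process may not inherit what your shell profile exports (a notebook kernel started outside that shell is one case where this can happen), it can help to check JAVA_HOME from Python itself. This is a minimal sketch; the helper name java_home_status is hypothetical.

```python
import os
from pathlib import Path


def java_home_status(env=os.environ):
    """Return (path, exists) for JAVA_HOME, or (None, False) if it is unset."""
    java_home = env.get("JAVA_HOME")
    if not java_home:
        return None, False
    # Report whether the configured path is actually a directory
    return java_home, Path(java_home).is_dir()


path, exists = java_home_status()
print(f"JAVA_HOME={path!r}, directory exists: {exists}")
```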

Get the JDBC driver

https://intersystems-community.github.io/iris-driver-distribution/

Put the jar file somewhere... I put it in $HOME

$ ls $HOME/*.jar
/Users/myname/intersystems-jdbc-3.8.4.jar
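
Rather than hard-coding the jar path, you could look it up at run time. This is a sketch under assumptions: the helper name find_jdbc_jar and the file name pattern are mine, and picking the last name in sorted order is a simple text sort, not a true version comparison.

```python
import tempfile
from pathlib import Path


def find_jdbc_jar(directory, pattern="intersystems-jdbc-*.jar"):
    """Return the last matching jar path in sorted order, or None if there is none."""
    matches = sorted(Path(directory).glob(pattern))
    return str(matches[-1]) if matches else None


# Self-contained demo using a temporary directory instead of $HOME
with tempfile.TemporaryDirectory() as tmp:
    empty_result = find_jdbc_jar(tmp)
    (Path(tmp) / "intersystems-jdbc-3.8.4.jar").touch()
    demo_result = find_jdbc_jar(tmp)

print(empty_result)
print(demo_result)
```

In your own code the call would be something like find_jdbc_jar(Path.home()), with a check for None before passing the result to the driver setup.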

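Sample code

It assumes you have already set up ODBC (an example for another day; the dog ate my notes...).

Note: this is adapted from my real code, so watch the variable names.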

import os

# Import the names directly; a preceding "import datetime" would be shadowed by this line
from datetime import date, time, datetime, timedelta

import pandas as pd
import pyodbc

import jaydebeapi
import jpype

def jdbc_create_connection(jdbc_url, jdbc_username, jdbc_password):

    # Path to JDBC driver
    jdbc_driver_path = '/Users/yourname/intersystems-jdbc-3.8.4.jar'

    # Ensure JAVA_HOME is set
    os.environ['JAVA_HOME']='/Library/Java/JavaVirtualMachines/jdk-23.jdk/Contents/Home'
    os.environ['CLASSPATH'] = jdbc_driver_path

    # Start the JVM (if not already running)
    if not jpype.isJVMStarted():
        jpype.startJVM(jpype.getDefaultJVMPath(), classpath=[jdbc_driver_path])

    # Connect to the database
    connection = None

    try:
        connection = jaydebeapi.connect("com.intersystems.jdbc.IRISDriver",
                                  jdbc_url,
                                  [jdbc_username, jdbc_password],
                                  jdbc_driver_path)
        print("Connection successful")
    except Exception as e:
        print(f"An error occurred: {e}")

    return connection


def odbc_create_connection(connection_string):
    connection = None

    try:
        # print(f"Connecting to {connection_string}")
        connection = pyodbc.connect(connection_string)

        # Ensure strings are read correctly
        connection.setdecoding(pyodbc.SQL_CHAR, encoding="utf8")
        connection.setdecoding(pyodbc.SQL_WCHAR, encoding="utf8")
        connection.setencoding(encoding="utf8")

    except pyodbc.Error as e:
        print(f"The error '{e}' occurred")

    return connection

# Parameters

odbc_driver = "InterSystems ODBC"
odbc_host = "your_host"
odbc_port = "51773"
odbc_namespace = "your_namespace"
odbc_username = "username"
odbc_password = "password"

jdbc_host = "your_host"
jdbc_port = "51773"
jdbc_namespace = "your_namespace"
jdbc_username = "username"
jdbc_password = "password"

# Create connection and create charts

jdbc_used = True

if jdbc_used:
    print("Using JDBC")
    jdbc_url = f"jdbc:IRIS://{jdbc_host}:{jdbc_port}/{jdbc_namespace}?useUnicode=true&characterEncoding=UTF-8"
    connection = jdbc_create_connection(jdbc_url, jdbc_username, jdbc_password)
else:
    print("Using ODBC")
    connection_string = f"Driver={odbc_driver};Host={odbc_host};Port={odbc_port};Database={odbc_namespace};UID={odbc_username};PWD={odbc_password}"
    connection = odbc_create_connection(connection_string)


if connection is None:
    # Stop here with a message; works without importing sys
    raise SystemExit("Unable to connect to IRIS")

cursor = connection.cursor()

site = "SAMPLE"
table_name = "your.TableNAME"

desired_columns = [
    "RunDate",
    "ActiveUsersCount",
    "EpisodeCountEmergency",
    "EpisodeCountInpatient",
    "EpisodeCountOutpatient",
    "EpisodeCountTotal",
    "AppointmentCount",
    "PrintCountTotal",
    "site",
]

# Construct the column selection part of the query
column_selection = ", ".join(desired_columns)

query_string = f"SELECT {column_selection} FROM {table_name} WHERE Site = '{site}'"

print(query_string)
cursor.execute(query_string)

if jdbc_used:
    # Fetch the results
    results = []
    for row in cursor.fetchall():
        converted_row = [str(item) if isinstance(item, jpype.java.lang.String) else item for item in row]
        results.append(converted_row)

    # Get the column names and ensure they are Python strings
    # (a java.lang.String would otherwise print as comma-separated characters)
    column_names = [str(col[0]) for col in cursor.description]

    # Create the dataframe
    df = pd.DataFrame.from_records(results, columns=column_names)
    print(df.head().to_string())
else:
    # For very large result sets, fetch in chunks with cursor.fetchmany() instead of fetchall()
    results = cursor.fetchall()
    # Get the column names
    column_names = [column[0] for column in cursor.description]
    # Create the dataframe
    df = pd.DataFrame.from_records(results, columns=column_names)

    print(df.head().to_string())

# # Build charts for a site
# cf.build_7_day_rolling_average_chart(site, cursor, jdbc_used)

cursor.close()
connection.close()

# Shutdown the JVM (if you started it)
# jpype.shutdownJVM()
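
The sample builds query_string by interpolating site directly into the SQL text. A variation worth considering is to pass the value as a query parameter instead; both pyodbc and JayDeBeApi use ? placeholders. The sketch below is not part of my original code: build_site_query is a hypothetical helper, and sqlite3 with a made-up Metrics table stands in for IRIS so the example runs anywhere.

```python
import sqlite3


def build_site_query(table_name, columns):
    """Build a SELECT with a ? placeholder for the site value.

    Table and column names cannot be bound as parameters, so they
    should come from a fixed list in your own code, never from user input.
    """
    column_selection = ", ".join(columns)
    return f"SELECT {column_selection} FROM {table_name} WHERE Site = ?"


# Stand-in database and table (not IRIS)
demo_conn = sqlite3.connect(":memory:")
demo_conn.execute(
    "CREATE TABLE Metrics (RunDate TEXT, ActiveUsersCount INTEGER, Site TEXT)"
)
demo_conn.executemany(
    "INSERT INTO Metrics VALUES (?, ?, ?)",
    [("2024-01-01", 10, "SAMPLE"), ("2024-01-01", 7, "OTHER")],
)

query = build_site_query("Metrics", ["RunDate", "ActiveUsersCount"])
demo_cursor = demo_conn.cursor()

# The site value travels separately from the SQL text
demo_cursor.execute(query, ("SAMPLE",))
site_rows = demo_cursor.fetchall()
print(site_rows)

demo_conn.close()
```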
