Loading UTF-8 Encoded Text into a MySQL Table
When a CSV file destined for a MySQL table contains non-English characters, the character encoding must be handled correctly. This matters for any non-ASCII character, because the bytes that represent it differ from one encoding to another.
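As a small illustration of why the encoding matters, the sketch below encodes the same string in two encodings and then decodes the UTF-8 bytes with the wrong one. The sample word is an assumption chosen for the demonstration; it does not come from the original data.

```python
# Sample text with one non-ASCII character (U+00E9, "e" with acute accent).
text = "caf\u00e9"

# The same character becomes different bytes in different encodings.
utf8_bytes = text.encode("utf-8")      # two bytes for U+00E9: C3 A9
latin1_bytes = text.encode("latin-1")  # one byte for U+00E9: E9

print(utf8_bytes)
print(latin1_bytes)

# Reading UTF-8 bytes as if they were Latin-1 yields garbled text (mojibake),
# which is what happens when a loader assumes the wrong character set.
misread = utf8_bytes.decode("latin-1")
print(misread)
```

If the loader is told the correct encoding, the bytes round-trip back to the original string; otherwise the stored value resembles `misread`.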
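In your case, setting the character set of the table column to UTF-8 is not sufficient on its own. The column definition only governs how values are stored, whereas LOAD DATA interprets the bytes in the file according to the character_set_database system variable unless told otherwise. To preserve non-English characters during the load, specify the character set explicitly with a CHARACTER SET clause in the LOAD DATA INFILE (or LOAD DATA LOCAL INFILE) statement.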
For Python, the following approach can be employed:
<code class="python">import MySQLdb

# Connect to the database
db = MySQLdb.connect(host="localhost", user="root", passwd="password", db="database_name")
cursor = db.cursor()

# Prepare the LOAD DATA statement.
# 'file' and table are placeholders for your file path and table name.
# The double quote and the backslash are escaped so that MySQL receives
# the literal text '"' and '\n'.
stmt = ("LOAD DATA INFILE 'file' "
        "IGNORE INTO TABLE table "
        "CHARACTER SET UTF8 "
        "FIELDS TERMINATED BY ';' "
        "OPTIONALLY ENCLOSED BY '\"' "
        "LINES TERMINATED BY '\\n'")

# Execute the statement
cursor.execute(stmt)

# Commit the changes
db.commit()</code>
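Specifying CHARACTER SET UTF8 in the statement instructs MySQL to interpret the bytes of the file as UTF-8, so non-ASCII characters are decoded and stored correctly instead of being misread in the server's default character set. Note that in MySQL, UTF8 is an alias for utf8mb3, which covers only characters of up to three bytes. If the data may contain four-byte characters such as emoji, use CHARACTER SET utf8mb4 together with a utf8mb4 column.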
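The statement can also be assembled by a small helper so that the character set and the LOCAL keyword are easy to vary. The following is only a sketch: `build_load_statement`, `ALLOWED_CHARSETS`, and the table name `people` are hypothetical names introduced for illustration, and it assumes a driver such as MySQLdb that accepts `%s`-style query parameters for the file path.

```python
import re

# Hypothetical allowlist; extend it with the character sets you actually use.
ALLOWED_CHARSETS = {"utf8", "utf8mb3", "utf8mb4", "latin1"}


def build_load_statement(table, charset="utf8mb4", local=False):
    """Return a LOAD DATA statement with a %s placeholder for the file path."""
    # Identifiers cannot be passed as query parameters, so validate them here.
    if not re.fullmatch(r"[A-Za-z0-9_]+", table):
        raise ValueError(f"unsafe table name: {table!r}")
    if charset not in ALLOWED_CHARSETS:
        raise ValueError(f"unexpected character set: {charset!r}")

    local_kw = "LOCAL " if local else ""
    return (
        f"LOAD DATA {local_kw}INFILE %s "
        f"IGNORE INTO TABLE `{table}` "
        f"CHARACTER SET {charset} "
        "FIELDS TERMINATED BY ';' "
        "OPTIONALLY ENCLOSED BY '\"' "
        "LINES TERMINATED BY '\\n'"
    )


# Build the statement for a hypothetical table named "people".
stmt = build_load_statement("people", charset="utf8mb4", local=True)
print(stmt)
```

With an open cursor, the file path would then be supplied as a parameter, for example `cursor.execute(stmt, ("/path/to/data.csv",))`, followed by `db.commit()`. Validating the table name and character set against fixed patterns keeps untrusted input out of the parts of the statement that cannot be parameterized.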