Integrating large language models into real-world product backends is the latest battleground for innovation. But as with every tech trend, the real winners aren’t the ones who rush in to know it all; they’re the ones who pause, reflect, and make smart decisions.
Even though AI is more accessible than ever, thinking it's a trivial task is a big mistake. The field is still in its infancy, and almost everyone in tech and business is trying to figure out how to make sense of it. The internet is overflowing with both reliable information and misleading hype.
Believe it or not, you don’t need to jump on every piece of AI gossip you hear. Take a step back and approach it thoughtfully.
If you’re a PHP developer or work with a PHP codebase, AI might feel like an alien concept. Classes, interfaces, message queues, and frameworks seem worlds apart from NLP, fine-tuning, stochastic gradient descent, LoRA, RAG, and all that jargon. I get it. Learning these concepts systematically, like anything in software engineering, takes time and good practice.
But aren’t AI, machine learning, and data science the domain of Python or R programmers? Partially, yes. Most foundational machine learning is done in Python. And honestly, Python is a joy to learn—it’s versatile and can be applied to countless interesting projects. From building and training a language model to deploying it via an API, Python has you covered.
But let’s get back to PHP.
Don’t worry—you’re not about to become obsolete as a regular software engineer. Quite the opposite. As we’re increasingly expected to understand related fields like containerization, CI/CD, systems engineering, and cloud infrastructure, AI is quickly becoming another essential skill in our toolkit. There’s no better time than now to start learning.
That said, I don’t recommend diving headfirst into building neural networks from scratch (unless you really want to). It’s easy to get overwhelmed. Instead, let me show you a practical starting point for your AI experiments.
Here’s what we’ll cover in this guide:
Asynchronous AI Workflows
I’ll show you how to implement an AI workflow using a message queue—whether it’s RabbitMQ, Amazon SQS, or your preferred broker.
Production-Ready Solution
You’ll see a real example of the solution deployed in a production system, solving a basic business requirement with AI.
Symfony Integration
The solution is fully implemented within the Symfony PHP framework.
OpenAI Tools
We’ll use the OpenAI PHP library and the latest OpenAI Assistants 2.0.
Dataset Structure
How to build a dataset for training and evaluating your AI model.
Fine-Tuning OpenAI Models
Learn to prepare a proper .jsonl file and fine-tune the GPT-4o-mini model, or another model from the GPT family, for your specific business use case.
Creating and Testing an OpenAI Assistant 2.0
Understand how to set up an OpenAI Assistant and test it in the OpenAI Playground.
Knowledge Bases
Dive into the concept of knowledge bases: why GPT doesn’t know everything and how to supply it with the right context to significantly boost accuracy.
PHP & Symfony Integration
Learn how to seamlessly connect your AI agent with your Symfony application.
Interested? Let's roll.
Let’s dive into the problem we’re solving.
At its core, we’re dealing with a block of text that represents something—an address, in our case. The goal is to classify its components into predefined groups.
So, when a user sends us an address, we aim to return a JSON structure that segments the address into its parts. For example:
{ "street": "Strada Grui", "block": "Bloc 7", "staircase": "Scara A", "apartment": "Apartament 6", "city": "Zărnești" }
Why Does This Matter?
Well, these addresses are typed by people—our customers—and they’re often inconsistent. That’s exactly why we need structured, parsed data.
To achieve this, we dissect the address into predefined groups like "street," "house," "postcode," etc., and then reassemble it into the desired sequence.
Why Not Use Regular Expressions?
At first glance, it sounds straightforward. If we enforce formatting for new customers or know they typically write addresses in a specific way, regular expressions might seem like a reasonable solution.
However, let’s consider examples of Romanian addresses:
STRADA GRUI, BLOC 7, SCARA A, APARTAMENT 6, ZĂRNEŞTI
BD. 21 DECEMBRIE 1989 nr. 93 ap. 50, CLUJ, 400124
Romanian addresses are often complex, written in no particular order, and frequently omit elements like the postcode. Even the most sophisticated regular expressions struggle to handle such variability reliably.
This is where AI models like GPT-3.5 Turbo or GPT-4o-mini shine—they can manage inconsistencies and complexities far beyond what static rules like regex can handle.
Yes, we’re developing an AI workflow that significantly improves on the traditional approach.
Every machine learning project boils down to two essentials: data and model. And while models are important, data is far more critical.
Models come pre-packaged, tested, and can be swapped out to see which one performs better. But the true game-changer is the quality of the data we provide to the model.
The Role of Data in Machine Learning
Typically, we split our dataset into two or three parts: a training set, a validation set, and (optionally) a test set.
For this project, we want to split an address into its components—making this a classification task. To get good results with a large language model (LLM) like GPT, we need at least 100 examples in our training dataset.
The raw format of our dataset doesn’t matter much, but I’ve chosen a structure that’s intuitive and easy to use:
{ "street": "Strada Grui", "block": "Bloc 7", "staircase": "Scara A", "apartment": "Apartament 6", "city": "Zărnești" }
Here are a couple of examples:

INPUT: STRADA EREMIA GRIGORESCU, NR.11 BL.45B, SC.B NR 11/38, 107065 PLOIESTI
OUTPUT: { "street": "Eremia Grigorescu", "house_number": "11", "flat_number": "38", "block": "45B", "staircase": "45B", "floor": "", "apartment": "", "landmark": "", "postcode": "107065", "county": "Prahova", "commune": "", "village": "", "city": "Ploiesti" }

INPUT: BD. 21 DECEMBRIE 1989 nr. 93 ap. 50, CLUJ, 400124
OUTPUT: { "street": "Bdul 21 Decembrie 1989", "house": "93", "flat": "50", "block": "", "staircase": "", "floor": "", "apartment": "", "landmark": "", "intercom": "", "postcode": "400124", "county": "Cluj", "commune": "", "village": "", "city": "Cluj" }
As you can see, the goal is to produce a structured JSON response based on the input address.
First, you’ll need an OpenAI API account. It’s a straightforward process, and I recommend adding some initial funds—$10 or $20 is plenty to get started. OpenAI uses a convenient pre-paid credit model, so you’ll have full control over your spending.
You can manage your account billing in the Billing section of the OpenAI platform settings.
Exploring the OpenAI Playground
Once your account is set up, head over to the OpenAI Playground.
Take a moment to familiarize yourself with the interface. When you’re ready, look for the "Assistants" section in the menu on the left.
This is where we’ll create our custom GPT instance.
Customizing Your GPT Assistant
We’ll tailor our GPT assistant to meet our specific needs in two stages:
Detailed System Instructions with Examples
This step helps us quickly test and validate whether the solution works. It’s the fastest way to see results.
Fine-Tuning with Knowledge Base (simple RAG)
Once we’re satisfied with the initial results, we’ll fine-tune the model. This process reduces the need to provide extensive examples in every prompt, which, in turn, decreases inference time (how long it takes for the model to respond via the API).
Both steps combined give us the most accurate and efficient results.
So, let's begin.
For our Named Entity Recognition system, we need the model to output structured JSON responses consistently. Crafting a thoughtful system prompt is critical to achieving this.
Here’s what we’ll focus on: a clear task overview, a strict output format, plenty of examples, and corrective feedback.
The "few-shot" technique is a powerful approach for this task. It works by providing the model with a few examples of the desired input-output relationship. From these examples, the model can generalize and handle inputs it hasn’t seen before.
Key Elements of a Few-Shot Prompt:
The prompt consists of several parts, which should be thoughtfully structured for best results. While the exact order can vary, the following sections are essential to include:
The first part of the prompt provides a general overview of what we expect the model to accomplish.
For example, in this use case, the input is a Romanian address, which may include spelling mistakes and incorrect formatting. This context is essential as it sets the stage for the model, explaining the kind of data it will process.
After defining the task, we guide the AI on how to format its output.
Here, the JSON structure is explained in detail, including how the model should derive each key-value pair from the input. Examples are included to clarify expectations. Additionally, any special characters are properly escaped to ensure valid JSON.
Examples are the backbone of the "few-shot" technique. The more relevant examples we provide, the better the model’s performance.
Thanks to GPT’s extensive context window (up to 16K tokens), we can include a large number of examples.
To build the example set:
Start with a basic prompt and evaluate the model’s output manually.
If the output contains errors, correct them and add the fixed version to the example set.
This iterative process improves the assistant’s performance over time.
It’s inevitable—your assistant will make mistakes. Sometimes, it might repeatedly make the same error.
To address this, include clear and polite corrective information in your prompt. Clearly explain:
What mistakes the assistant is making.
What you expect from the output instead.
This feedback helps the model adjust its behavior and generate better results.
Now that we’ve crafted the initial prompt for our proof-of-concept few-shot solution, it’s time to configure and test the Assistant.
Step 1: Create a New Assistant
Step 2: Configure Your Assistant
Step 3: Choose the Right Model
For this project, I chose GPT-4o-mini because it offers a good balance of accuracy and cost for a classification task like this.
That said, you should always select your model based on a balance between accuracy and cost. Run or search for benchmarks to determine what works best for your specific task.
With the initial configuration in place, we can now specify the output schema directly in the Assistant creator.
Step 1: Using the "Generate" Option
Instead of creating the schema manually, I used the "Generate" option provided by the Assistant creator.
The tool does a great job of automatically generating a schema that matches your structure.
To ensure consistent and predictable responses, set the temperature as low as possible.
What is Temperature?
Temperature controls the randomness of the model’s responses. A low temperature value means the model produces more predictable and deterministic outputs.
For our use case, this is exactly what we want. When given the same address as input, the model should always return the same, correct response. Consistency is key for reliable results.
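If you later want to reproduce the same configuration programmatically rather than through the GUI, a hedged sketch with the openai-php client might look like the following (it assumes a $client initialized as in the integration section below and a $systemPrompt holding the few-shot prompt; the schema is truncated and all values are illustrative):

```php
$assistant = $client->assistants()->create([
    'name' => 'Romanian Address NER',
    'model' => 'gpt-4o-mini',
    'instructions' => $systemPrompt,   // the few-shot prompt described above
    'temperature' => 0.1,              // low temperature for deterministic output
    'response_format' => [
        'type' => 'json_schema',
        'json_schema' => [
            'name' => 'normalized_address',
            'schema' => [
                'type' => 'object',
                'properties' => [
                    'street' => ['type' => 'string'],
                    'city'   => ['type' => 'string'],
                    // ...remaining keys (block, staircase, apartment, postcode, ...)
                ],
                'additionalProperties' => false,
            ],
        ],
    ],
]);
```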
With all the parameters in place, head to the Playground to test your solution.
The Playground provides a console where you can put your Assistant to work. This is where the fun begins—you can rigorously test your bare-bones solution to uncover edge cases, recurring mistakes, and formatting issues.
These findings will help you refine the Corrective Information section of your prompt, making your Assistant more robust.
Below is an example of a result from one of my manual tests:
{ "street": "Strada Grui", "block": "Bloc 7", "staircase": "Scara A", "apartment": "Apartament 6", "city": "Zărnești" }
Why Manual Testing Matters?
Old-school, hands-on testing is the foundation of building a reliable solution. By evaluating the model’s outputs manually, you’ll quickly spot issues and understand how to fix them. While automation will come later, manual testing is an invaluable step in creating a solid proof-of-concept.
Now it’s time to integrate everything into your PHP Symfony application. The setup is straightforward and follows a classic asynchronous architecture.
Here’s a breakdown of the flow:
1. Frontend Interaction
In general, what we have here is a classic asynchronous setup with a message queue. Two instances of the Symfony application run inside Docker containers. The first one interacts with the frontend client.
For example, when a customer types:

STRADA GRUI, BLOC 7, SCARA A, APARTAMENT 6, ZĂRNEŞTI
The application processes the input address and packages it into a Message object.
The Message object is then wrapped in a Symfony Messenger Envelope instance. The message is serialized into JSON format, with additional metadata for processing.
Symfony Messenger is perfect for handling asynchronous tasks. It allows us to offload time-consuming operations, like calling the OpenAI API, to background processes.
This approach ensures the frontend stays responsive while the slow OpenAI calls happen in the background.
Below is a minimal sketch of the kind of message class used for such a system (the class and field names here are illustrative):
{ "street": "Strada Grui", "block": "Bloc 7", "staircase": "Scara A", "apartment": "Apartament 6", "city": "Zărnești" }
The application connects to a RabbitMQ instance using the amqp PHP extension. To set this up, you need to define the transport and message binding in your messenger.yaml configuration file.
Refer to the official Symfony Messenger documentation for detailed guidance:
Symfony Messenger Transport Configuration
Documentation: https://symfony.com/doc/current/messenger.html#transport-configuration
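For reference, a minimal messenger.yaml along those lines might look like this (the transport name, DSN, and message class are illustrative):

```yaml
# config/packages/messenger.yaml
framework:
    messenger:
        transports:
            async_addresses:
                # e.g. amqp://guest:guest@rabbitmq:5672/%2f/messages
                dsn: '%env(MESSENGER_TRANSPORT_DSN)%'
        routing:
            # Route the address message to the asynchronous transport
            'App\Message\NormalizeCustomerAddressMessage': async_addresses
```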
Once the message is pushed to the broker (e.g., RabbitMQ, AmazonMQ, or AWS SQS), it’s picked up by the second instance of the application. This instance runs the messenger daemon to consume messages, as marked 3 in the architecture schema.
The consuming process is handled by running:

bin/console messenger:consume

which runs as a long-lived worker process.
The daemon picks up the message from the configured queue, deserializes it back into the corresponding Message class, and forwards it to the Message Handler for processing.
Here’s a sketch of the kind of Message Handler in which the interaction with the OpenAI Assistant occurs (the class, service, and method names are illustrative):
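```php
<?php

namespace App\MessageHandler;

use App\Message\NormalizeCustomerAddressMessage;
use App\Service\OpenAIAddressNormalizationService;
use Doctrine\ORM\EntityManagerInterface;
use Psr\Log\LoggerInterface;
use Symfony\Component\Messenger\Attribute\AsMessageHandler;

#[AsMessageHandler]
final class NormalizeCustomerAddressHandler
{
    public function __construct(
        private readonly OpenAIAddressNormalizationService $normalizer,
        private readonly EntityManagerInterface $entityManager,
        private readonly LoggerInterface $logger,
    ) {
    }

    public function __invoke(NormalizeCustomerAddressMessage $message): void
    {
        // Log the start of processing for traceability.
        $this->logger->info('Normalizing address for customer {id}', [
            'id' => $message->getCustomerId(),
        ]);

        // Call the normalization service, which talks to the OpenAI Assistant.
        $normalized = $this->normalizer->getNormalizedAddress($message->getRawAddress());

        // Persist the structured result (e.g. a CustomerAddressNormalized entity).
        $this->entityManager->persist($normalized);
        $this->entityManager->flush();
    }
}
```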
Key Points of the Handler
Logging
Logs the start of processing for traceability and debugging.
Normalization Service
The OpenAIAddressNormalizationService is called to process the input address via the Assistant.
Persistence
The normalized address is saved in the database using Doctrine's EntityManager.
The message handler may seem like standard Symfony code, but its core is the single call to the normalization service.
This service is responsible for interacting with OpenAI via the designated PHP library.
OpenAI PHP Client
Let’s dive into the implementation of the OpenAIAddressNormalizationService and walk through it stage by stage.
The workflow for getting an inference (or completion)—essentially the response from GPT—using the openai-php/client library follows three main stages:
The first step is setting up the OpenAI client and retrieving the configured Assistant. A minimal sketch using openai-php/client might look like this (environment variable names are illustrative):
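```php
use OpenAI;

// Create the client with your API key and (optionally) organization ID.
$client = OpenAI::client(
    $_ENV['OPENAI_API_KEY'],
    $_ENV['OPENAI_ORGANIZATION'] ?? null,
);

// Retrieve the Assistant configured earlier, identified by its unique ID.
$assistant = $client->assistants()->retrieve($_ENV['OPENAI_ASSISTANT_ID']);
```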
Client Initialization: The OpenAI::client() method creates a new client using your API key and organization ID.
Assistant Retrieval: The retrieve() method connects to the specific Assistant instance configured for your task, identified by its unique ID.
Once the client and Assistant are initialized, you can create and run a thread to begin the interaction. A thread acts as the communication pipeline, handling the exchange of messages between the user and the Assistant.
Here’s roughly how the thread is initiated (a sketch following the Assistants v2 API):
{ "street": "Strada Grui", "block": "Bloc 7", "staircase": "Scara A", "apartment": "Apartament 6", "city": "Zărnești" }
After initiating the thread, the application handles the response. Since OpenAI processing may not complete instantly, you need to poll the run’s status until it’s marked as 'completed'. A sketch of the polling loop:
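```php
// Poll until the run leaves the queued/in_progress states.
// A production version would add a timeout and error handling.
do {
    sleep(1);

    $run = $client->threads()->runs()->retrieve(
        threadId: $threadRun->threadId,
        runId: $threadRun->id,
    );
} while (in_array($run->status, ['queued', 'in_progress'], true));
```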
Polling: The application repeatedly checks the run’s status (queued or in_progress). Once the status changes to 'completed', the thread contains the final output.
Once the run is complete, the application retrieves the response messages from the thread to extract the normalized address. A sketch:
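```php
// The most recent message in the thread is the Assistant's reply.
$messages = $client->threads()->messages()->list($threadRun->threadId);

$reply = $messages->data[0]->content[0]->text->value;

// The reply is the JSON structure we asked for; decode it for further use.
$normalizedAddress = json_decode($reply, true, 512, JSON_THROW_ON_ERROR);
```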
The normalized address can now be returned to the frontend client for immediate use or stored in the database as a structured entity, like CustomerAddressNormalized, for future processing.
When you run this setup, you should be able to extract and save structured output from OpenAI for Named Entity Recognition and other classification or generation tasks.
In some cases, basic AI solutions aren’t enough. When regulatory compliance and business requirements demand high accuracy, we need to go the extra mile to ensure the output is factual and reliable.
For example, when generating a JSON structure, we need to guarantee that its contents align with reality. A common risk is the model hallucinating information—like inventing a place based on a provided postal code. This can cause serious issues, particularly in high-stakes environments.
Ground Truth with Knowledge Base
To eliminate hallucinations and ensure factual accuracy, we need to supply the Assistant with a ground truth knowledge base. This acts as the definitive reference for the model, ensuring it uses accurate information during inference.
My Approach: A Postal Code Knowledge Base
I created a simple (but quite large—around 12 MB) JSON file containing the full demarcation of all postal codes in Romania. This dictionary-like structure provides:
Postal Code: The input value.
Validated Information: Facts like the corresponding city, county, and overall place name.
This knowledge base serves as a reference point for the Assistant when performing Named Entity Recognition.
Structure of the Knowledge Base
Here’s an illustrative snippet of the knowledge base JSON structure (the exact field names below are a sketch):
{ "street": "Strada Grui", "block": "Bloc 7", "staircase": "Scara A", "apartment": "Apartament 6", "city": "Zărnești" }
With the knowledge base ready, it’s time to instruct the Assistant to use it effectively. While you could embed this directly into the prompt, this approach increases token usage and costs. This makes it an excellent moment to introduce fine-tuning.
Fine-tuning involves continuing the training of a pre-trained model on your own examples, adjusting (a subset of) its weights so that it adapts to a specific task. In our case, Named Entity Recognition (NER) is a perfect candidate.
A fine-tuned model needs far fewer in-prompt examples, responds faster, and follows the expected output format more reliably.
The main goal is to align the model more closely with the nuances of the real-world data it will process. Training on domain-specific data makes the model better at understanding and generating appropriate responses. It also reduces the need for post-processing or error correction by the application.
To fine-tune the model, we need a properly formatted dataset in .jsonl (JSON Lines) format, as required by OpenAI. Each entry in the dataset includes a system message, a user message, and the expected assistant response.
This dataset provides the Assistant with domain-specific examples, enabling it to learn how to respond to similar prompts in the future.
Here’s an example of how a fine-tuning entry might be structured (the system prompt wording and exact key list are illustrative; each entry sits on a single line of the .jsonl file):
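```json
{"messages": [{"role": "system", "content": "You are a Named Entity Recognition model for Romanian addresses. Extract and classify the address components and reply only with a JSON object using the keys: street, house, flat, block, staircase, floor, apartment, landmark, intercom, postcode, county, commune, village, city."}, {"role": "user", "content": "BD. 21 DECEMBRIE 1989 nr. 93 ap. 50, CLUJ, 400124"}, {"role": "assistant", "content": "{\"street\": \"Bdul 21 Decembrie 1989\", \"house\": \"93\", \"flat\": \"50\", \"block\": \"\", \"staircase\": \"\", \"floor\": \"\", \"apartment\": \"\", \"landmark\": \"\", \"intercom\": \"\", \"postcode\": \"400124\", \"county\": \"Cluj\", \"commune\": \"\", \"village\": \"\", \"city\": \"Cluj\"}"}]}
```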
Let’s break down the structure of a fine-tuning entry in the .jsonl file, using our Romanian address processing case as an example:
Each entry is designed to simulate a real interaction between the user and the Assistant, encapsulating the system instructions, the user’s raw input, and the expected structured response.
This approach helps the model learn the desired behavior in real-world contexts.
1. System Role Message
Describes the assistant’s capabilities and the scope of its functionality, setting expectations for the types of entities it should recognize and extract, such as street names, house numbers, and postal codes.
Example: The system explains that the assistant is designed to function as a Named Entity Recognition model for Romanian addresses, detailing the components it should extract and classify.
2. User Role Message
Contains the raw address exactly as a customer might type it, including any inconsistencies or minor errors.
3. Assistant Role Message
Contains the expected structured JSON response that correctly classifies the address components.
Once you’ve created the training file, the next step is to prepare a validation file. This file evaluates the accuracy of the fine-tuned Assistant on real data. The structure is similar to the training file, which makes automating the creation of both files more straightforward.
The validation file is designed to test the model’s generalization capabilities. It ensures that the fine-tuned Assistant can handle new inputs and perform consistently, even when faced with unfamiliar or challenging examples.
System Message:
To maintain consistency with the training process, the system message should be identical to the one used in the training file.
User Message:
Introduces a new input (e.g., an address) that was not included in the training file. The new example should be realistic and domain-specific, as well as challenge the model by including minor errors or complexities.
Assistant Message:
Provides the expected structured response for the new user input.
This serves as the gold standard for validating the model’s accuracy.
To streamline the creation of both training and validation files, automation scripts can be used. These scripts generate properly formatted .jsonl files based on input datasets.
Visit the repository containing scripts to generate training and validation files:
Automated .jsonl generation scripts
It’s a Python version of the fine-tuning automation; a PHP version is coming soon.
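As an illustration of the underlying idea, here’s a minimal PHP sketch that turns an array of input/output pairs into a .jsonl training file (the file name, system prompt, and example data are placeholders):

```php
<?php

$systemPrompt = 'You are a Named Entity Recognition model for Romanian addresses. Reply only with JSON.';

// Raw address => expected structured output (placeholder data).
$examples = [
    'BD. 21 DECEMBRIE 1989 nr. 93 ap. 50, CLUJ, 400124' => [
        'street'   => 'Bdul 21 Decembrie 1989',
        'house'    => '93',
        'postcode' => '400124',
        'county'   => 'Cluj',
        'city'     => 'Cluj',
    ],
];

$lines = [];
foreach ($examples as $input => $output) {
    // One chat-formatted training example per line, as OpenAI expects.
    $lines[] = json_encode([
        'messages' => [
            ['role' => 'system', 'content' => $systemPrompt],
            ['role' => 'user', 'content' => $input],
            ['role' => 'assistant', 'content' => json_encode($output, JSON_UNESCAPED_UNICODE)],
        ],
    ], JSON_UNESCAPED_UNICODE);
}

file_put_contents('training.jsonl', implode("\n", $lines) . "\n");
```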
Why Validation Matters?
A separate validation set is the only objective signal that your fine-tuned Assistant is learning to generalize rather than memorize; it also drives the validation loss metric you’ll monitor during training.
If you prefer, you can fine-tune the Assistant manually using OpenAI's graphical user interface (GUI). Once you’ve prepared both the training and validation files, follow these steps to get started:
Step 1: Access the Fine-Tuning Wizard
Step 2: Configure Fine-Tuning
In the fine-tuning menu, select the base model (gpt-4o-mini in our case) and upload your training and validation .jsonl files.
Step 3: Monitor the Fine-Tuning Process
Once the fine-tuning process starts, you can track its progress in the Fine-Tuning Dashboard:
The dashboard provides real-time updates and displays several metrics to help you monitor and evaluate the training process.
Key Metrics to Understand
Training loss: Measures how well the model is fitting the training data. A lower training loss indicates the model is effectively learning patterns within the dataset.
Full validation loss: Indicates performance on unseen data from the validation dataset. A lower validation loss suggests the model will generalize well to new inputs.
Steps and time: Training steps are the number of iterations in which the model’s weights are updated on batches of data.
The timestamp indicates when each step was processed, which helps you monitor the duration of training and the intervals between evaluations.
Interpreting the Metrics
Monitoring these metrics helps you determine whether the fine-tuning process is converging correctly or requires adjustments.
Underfitting: Happens when both losses remain high, showing the model isn’t learning effectively.
Overfitting: Happens when training loss keeps falling while validation loss rises, showing the model is memorizing the training data instead of generalizing.
With your Assistant trained and fine-tuned, it’s time to integrate it into your application and start evaluating its performance.
Start Simple.
Begin with manual tests, either in the Assistants Playground or directly within your application. Avoid overcomplicating your evaluation at this stage; focus on ensuring the basics work as expected.
For example, you can use a simple tool like Google Sheets to manually compare inputs and outputs side by side.
This is the first step.
AI and ML solutions require ongoing evaluation to ensure performance remains consistent. For tasks like Named Entity Recognition, automated tools such as the PromptFlow extension for Visual Studio Code can help streamline testing and validation.
References
OpenAI Fine-Tuning Documentation
Symfony Messenger Documentation
OpenAI PHP Client Library
PromptFlow Extension for VS Code
Thank You!
Thank you for taking the time to explore this guide! I hope it provides you with a solid foundation to build and fine-tune your AI-powered solutions. If you have any questions or suggestions, feel free to reach out or leave a comment.
Happy coding and best of luck with your AI projects!