In this hands-on guide, I’ll use the DeepSeek-R1 model to build a math puzzle solver assistant integrated with EasyOCR and Gradio.
I’ll explain step-by-step how to build a functional web app capable of solving a wide range of mathematical puzzles and generating helpful solutions using the excellent reasoning capabilities of the DeepSeek R1 model.
If you only want to get an overview of DeepSeek-R1, I recommend checking out this DeepSeek-R1 overview. To fine-tune the model, I recommend this tutorial on fine-tuning DeepSeek-R1.
To build our puzzle solver assistant, we’ll go over the following steps:
Before diving into the implementation, let’s make sure the following libraries are installed: PyTorch (torch), Gradio, Pillow, and EasyOCR.
Run the following commands to install the necessary dependencies:
!pip install torch gradio pillow easyocr -q
Once the above dependencies are installed, run the following import commands:
import torch
from PIL import Image
import easyocr
import requests
import json
import gradio as gr
The following script demonstrates how to interact with the DeepSeek API to obtain responses based on user prompts. Note that DeepSeek's API is compatible with OpenAI's format and uses a base URL for API requests.
You can either directly pass in the API key (not recommended for privacy reasons), or if using Google Colab like me, you can save the API key using the Secrets feature. Alternatively, you can use environment variables.
# DeepSeek API configuration
DEEPSEEK_API_URL = "https://api.deepseek.com/v1/chat/completions"

# If you're using Colab and storing your key in the Secrets tab:
from google.colab import userdata
API_KEY = userdata.get('SECRET_KEY')

# If you are running this code elsewhere, replace 'YOUR_API_KEY' with your
# actual DeepSeek API key and uncomment the following line.
# API_KEY = 'YOUR_API_KEY'
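If you go the environment-variable route instead, a minimal sketch could look like the following. The variable name `DEEPSEEK_API_KEY` and the helper `load_api_key()` are my own choices for illustration, not something DeepSeek requires.

```python
import os


def load_api_key(env_var: str = "DEEPSEEK_API_KEY") -> str:
    """Read the API key from an environment variable.

    The variable name is an assumption made for this sketch; use whatever
    name you exported in your shell or deployment settings.
    """
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(
            f"Environment variable {env_var} is not set. "
            "Export it before starting the app."
        )
    return key
```

Outside Colab, you could then write `API_KEY = load_api_key()` rather than hardcoding the key in the notebook.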
At the time of publishing this article, DeepSeek’s services are under heavy load, and their performance is degraded—I’ve also had major difficulties running the code for this project. Please check DeepSeek’s status page before attempting to run the code in this project.
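Because the API can be slow or temporarily unavailable, it may help to retry a failed call a few times before giving up. The helper below is a generic sketch and not part of the original project: `call_with_retries()` and its parameters are hypothetical names, and it simply wraps whatever function you pass in.

```python
import time


def call_with_retries(func, max_attempts=3, base_delay=1.0, sleep=time.sleep):
    """Call func() until it succeeds or max_attempts is reached.

    Waits base_delay, then 2 * base_delay, then 4 * base_delay, ... between
    attempts (exponential backoff). The sleep argument is injectable so the
    helper can be exercised without actually waiting.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    last_error = None
    for attempt in range(max_attempts):
        try:
            return func()
        except Exception as exc:  # in real code, catch narrower exceptions
            last_error = exc
            # Do not sleep after the final failed attempt
            if attempt < max_attempts - 1:
                sleep(base_delay * (2 ** attempt))
    raise last_error
```

Since a `requests` call raises on timeouts and connection errors, the natural place to use this would be around the HTTP request itself, for example `call_with_retries(lambda: requests.post(url, headers=headers, json=data, timeout=15))`.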
Now that the API is set up, we can work on the app’s features. In this section, we’ll process an image containing a logic puzzle, extract the puzzle text using OCR, refine the text, and send it to the DeepSeek API for solving. Let’s first see the code, and then I’ll explain it.
reader = easyocr.Reader(['en'])

def solve_puzzle(image):
    """Extracts the puzzle from the image and sends it to DeepSeek for solving."""
    try:
        # 1. Save the uploaded image temporarily; EasyOCR uses file paths
        image_path = "uploaded_image.png"
        image.save(image_path)

        # 2. Extract text from the image using EasyOCR
        results = reader.readtext(image_path)
        extracted_text = " ".join([res[1] for res in results])

        # Standardize the text to avoid misinterpretation of "??" as "2?"
        extracted_text = extracted_text.replace('??', '?')
        if "?" not in extracted_text:
            extracted_text += "?"
        print("Extracted Text:", extracted_text)  # Debugging output

        # 3. Refine the extracted text to standardize expressions
        refined_text = extracted_text.replace('x', '*').replace('X', '*').replace('=', ' = ').strip()
        print("Refined Text:", refined_text)  # Debugging output

        # 4. Compose the user message with concise instructions
        puzzle_prompt = (
            f"You are an AI specialized in solving puzzles. Analyze the following, identify hidden patterns or rules, and provide the missing value with step-by-step reasoning in text format. Do not return an answer in Latex."
            f"\nPuzzle:\n{refined_text}\n"
            "Format your response strictly as follows:\n"
            "1. **Given Equation**:\n - (original equations)\n"
            "2. **Pattern Identified**:\n (explain the hidden logic)\n"
            "3. **Step-by-step Calculation**:\n - For (input values):\n (calculation and result)\n"
            "4. **Final Answer**:\n (Answer = X)"
        )
        messages = [
            {"role": "user", "content": puzzle_prompt}
        ]

        # 5. Optimized API request for faster response
        data = {
            "model": "deepseek-reasoner",
            "messages": messages,
            "temperature": 0,
            "max_tokens": 100
        }
        headers = {
            "Authorization": f"Bearer {API_KEY}",
            "Content-Type": "application/json"
        }

        # 6. Send the request to DeepSeek with a timeout
        response = requests.post(DEEPSEEK_API_URL, headers=headers, json=data, timeout=15)

        # 7. Check the result
        if response.status_code == 200:
            try:
                json_resp = response.json()
                return json_resp.get("choices", [{}])[0].get("message", {}).get("content", "Error: No response content.").strip()
            except json.JSONDecodeError:
                return "Error: Invalid JSON response from DeepSeek API."
        else:
            return f"Error: DeepSeek API failed with status code {response.status_code}, Response: {response.text}"

    except requests.exceptions.Timeout:
        return "Error: DeepSeek API request timed out. Please try again."
    except Exception as e:
        return f"Error: {str(e)}"
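The solve_puzzle() function processes an image containing a logic puzzle and solves it using OCR and the R1 model. It follows these steps:
1. Save the uploaded image to a temporary file so EasyOCR can read it from a file path.
2. Extract the text with EasyOCR, join the recognized fragments into one string, and make sure it ends up containing a single "?".
3. Refine the text by turning x and X into * and padding = with spaces.
4. Compose a prompt that asks for the pattern, a step-by-step calculation, and a final answer in a fixed format.
5. Build the request body for the deepseek-reasoner model with temperature 0 and max_tokens 100.
6. Send the request to the DeepSeek API with a 15-second timeout.
7. Check the status code and return either the model’s answer or an error message.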
This pipeline combines OCR for text extraction and the DeepSeek API for intelligent puzzle-solving.
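One thing to keep in mind about the refinement step is that replacing every x with * also rewrites the letter x inside ordinary words that OCR may pick up. As a possible alternative (not what the code above does), the sketch below uses a regular expression so that x only becomes a multiplication sign when it sits between two digits. The function name `refine_puzzle_text()` is hypothetical.

```python
import re


def refine_puzzle_text(text: str) -> str:
    """Normalize OCR output before it is placed in the prompt."""
    # Collapse a doubled question mark, which may otherwise be read as "2?"
    text = text.replace("??", "?")
    # Treat x/X as multiplication only when it sits between two digits
    text = re.sub(r"(?<=\d)\s*[xX]\s*(?=\d)", " * ", text)
    # Put exactly one space on each side of every equals sign
    text = re.sub(r"\s*=\s*", " = ", text)
    # Collapse runs of whitespace and trim the ends
    text = re.sub(r"\s+", " ", text).strip()
    # Make sure the puzzle still asks for a missing value
    if "?" not in text:
        text += " ?"
    return text


print(refine_puzzle_text("2x3=6 4 X 5=??"))   # 2 * 3 = 6 4 * 5 = ?
print(refine_puzzle_text("Next: 8 + 11 ="))   # Next: 8 + 11 = ?
```

The second call shows the difference from a plain `replace`: the x in "Next" is left alone because it is not surrounded by digits.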
Gradio allows us to create an interactive web interface for our application. The following code snippet creates a user-friendly Gradio web interface for the solve_puzzle() function. The interface takes the image the user uploads, passes it to solve_puzzle(), and displays the returned solution.
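A minimal sketch of such an interface might look like the following, assuming solve_puzzle() from the previous section is defined and expects a PIL image. The `build_interface()` wrapper and the title and description strings are placeholders of mine, so adjust them freely.

```python
def build_interface(solve_fn):
    """Wrap a puzzle-solving function in a simple Gradio UI."""
    # Imported here so the function can be defined even if Gradio
    # is not installed yet; calling it does require Gradio.
    import gradio as gr

    return gr.Interface(
        fn=solve_fn,                   # called with the uploaded image
        inputs=gr.Image(type="pil"),   # hand the image over as a PIL object
        outputs="text",                # show the returned solution as plain text
        title="Math Puzzle Solver",    # placeholder title
        description="Upload an image of a puzzle to get a step-by-step solution.",
    )


# In the notebook, after solve_puzzle() has been defined:
# build_interface(solve_puzzle).launch(debug=True)
```

Passing `type="pil"` matters here because solve_puzzle() calls `image.save(...)`, which is a PIL method.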
The above setup includes three components:
Let’s test our app with a puzzle that involves math and logic.
If you look at the first row, you’ll see 1 + 4 = 5, and you may say this is simple addition. But on the second row we have 2 + 5 = 12, and then 3 + 6 = 21. Can you figure out the pattern and solve 8 + 11 = ?
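One rule that fits all three given rows is a + a × b (equivalently a × (b + 1)). The sketch below checks that candidate against the rows and applies it to the last one. It is a hand check under that assumption, not the model’s output, and puzzles like this can admit more than one valid rule.

```python
def candidate_rule(a: int, b: int) -> int:
    """Candidate pattern: add a to the product of a and b."""
    return a + a * b


# Rows from the puzzle: (left operand, right operand, stated result)
rows = [(1, 4, 5), (2, 5, 12), (3, 6, 21)]

# The candidate is only acceptable if it reproduces every given row
assert all(candidate_rule(a, b) == result for a, b, result in rows)

print(candidate_rule(8, 11))  # 8 + 8 * 11 = 96
```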
If you look on the right side of the Gradio interface, you’ll see that the Puzzle Solver app has identified the pattern:
In this tutorial, we built a math puzzle solver assistant using DeepSeek-R1 combined with EasyOCR and Gradio. To keep up with the latest in AI, I recommend these blogs: