This week, I've been working on a command-line tool I named codeshift, which lets users input source code files, choose a programming language, and translate them into their chosen language.
There's no fancy stuff going on under the hood - it just uses an AI provider called Groq to handle the translation - but I wanted to go over the development process, how the tool is used, and what features it offers.
Command-line tool that transforms source code files into any language.
codeshift [-o <output-filename>] <output-language> <input-files...>
For example, to translate the file examples/index.js to Go and save the output to index.go:
codeshift -o index.go go examples/index.js
I've been working on this project as part of the Topics in Open Source Development course at Seneca Polytechnic in Toronto, Ontario. Starting out, I wanted to stick with technologies I was comfortable with, but the instructions for the project encouraged us to learn something new, like a new programming language or a new runtime.
Although I'd been wanting to learn Java, after doing some research online, it seemed like it wasn't a great choice for developing a CLI tool or interfacing with AI models. It isn't officially supported by OpenAI, and the community library featured in their docs is deprecated.
I've always been one to stick with the popular technologies - they tend to be reliable and have complete documentation and tons of information available online. But this time, I decided to do things differently. I decided to use Bun, a cool new runtime for JavaScript meant to replace Node.
Turns out I should've stuck with my gut. I ran into trouble trying to compile my project and all I could do was hope the developers would fix the issue.
Referenced previously here, closed without resolution: https://github.com/openai/openai-node/issues/903
This is a pretty big issue as it prevents usage of the SDK while using the latest Sentry monitoring package.
```js
import * as Sentry from '@sentry/node';

// Start Sentry
Sentry.init({
  dsn: "https://your-sentry-url",
  environment: "your-env",
  tracesSampleRate: 1.0, // Capture 100% of the transactions
});
```
```js
const params = {
  model: model,
  stream: true,
  stream_options: { include_usage: true },
  messages,
};
const completion = await openai.chat.completions.create(params);
```
Results in error:
```
TypeError: getDefaultAgent is not a function
    at OpenAI.buildRequest (file:///my-project/node_modules/openai/core.mjs:208:66)
    at OpenAI.makeRequest (file:///my-project/node_modules/openai/core.mjs:279:44)
```
Code snippets: (Included)
OS: All operating systems (macOS, Linux)
Node version: v20.10.0
Library version: v4.56.0
This turned me away from Bun. I'd found out from our professor we were going to compile an executable later in the course, and I did not want to deal with Bun's problems down the line.
So, I switched to Node. It was painful going from Bun's easy-to-use built-in APIs to having to learn how to use commander for Node. But at least it wouldn't crash.
I had previous experience working with AI models through code thanks to my co-op, but I was unfamiliar with creating a command-line tool. Configuring the options and arguments turned out to be the most time-consuming aspect of the project.
Apart from the core feature we chose for each of our projects - mine being code translation - we were asked to implement any two additional features. One of the features I chose was saving output to a specified file. Currently, I'm not sure this feature is that useful on its own, since you could just redirect stdout to a file, but in the future I want to use it to write only the extracted code to the file while printing the full response, including the AI's rationale behind the translation, to stdout. Writing this feature also taught me about global and command-based options in commander.js. Since there was only one command (run) and it was the default, I wanted the option to show up in the default help menu rather than only under codeshift help run, so I had to learn to implement it as a global option.
I also ended up "accidentally" implementing the feature for streaming the response to stdout. I was at first scared away from streaming, because it sounded too difficult. But later, when I was trying to read the input files, I figured reading large files in chunks would be more efficient. I realized I'd already implemented streaming in my previous C++ courses, and figuring it wouldn't be too bad, I got to work.
Then, halfway through my implementation, I realized I'd have to send the whole file to the AI at once regardless.
But this encouraged me to try streaming the output from the AI. So I hopped on MDN and started reading about ReadableStreams and messing around with ReadableStreamDefaultReader.read() for what felt like an hour - only to scroll down the AI provider's documentation and realize all I had to do was add stream: true to my request.
Either way, I may have taken the scenic route but I ended up implementing streaming.
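With stream: true, OpenAI-compatible SDKs return an async iterable of chunks, each carrying a small delta of text. Here's a self-contained sketch of that consumption loop, with a stub async generator standing in for the real Groq response so it runs without an API key - the chunk shape mirrors the SDK's, but the content is made up:

```javascript
// Sketch of consuming a streamed chat completion. stubCompletion stands
// in for the async iterable the SDK returns when `stream: true` is set.
async function* stubCompletion() {
  const deltas = ["package main\n", "\nfunc main() {", "}\n"];
  for (const content of deltas) {
    yield { choices: [{ delta: { content } }] };
  }
}

async function streamToStdout(completion) {
  let full = "";
  for await (const chunk of completion) {
    // Some chunks carry no text (e.g. a final usage chunk), so guard it.
    const text = chunk.choices[0]?.delta?.content ?? "";
    process.stdout.write(text);
    full += text;
  }
  return full; // handy if you also want to save the output to a file
}

streamToStdout(stubCompletion());
```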
Right now, the program parses each source file individually, with no shared context. So if a file references another, it wouldn't be reflected in the output. I'd like to enable it to have that context eventually. Like I mentioned, another feature I want to add is writing the AI's reasoning behind the translation to stdout but leaving it out of the output file. I'd also like to add some of the other optional features, like options to specify the AI model to use, the API key to use, and reading that data from a .env file in the same directory.
That's about it for this post. I'll be writing more in the coming weeks.