Any large-scale software is not fully conceived from the beginning, but is improved, edited, unit tested, repaired by developers, solved by code review, and solved again and again until it is satisfied and goes online. The code can be merged into the warehouse only after the requirements are met.
The knowledge of controlling the entire process is called Software Engineering.
Software engineering is not an independent process, but consists of developers, code reviewers, bug reporters, software architects and various development tools (such as compilers, unit tests, Connector, static analyzer).
Recently, Google announced its own DIDACT (Dynamic Integrated Developer ACTivity, dynamic integrated developer activity) framework, which uses AI technology to enhance software engineering and integrate software development The intermediate states are used as training data to assist developers in writing and modifying code, and understand the dynamics of software development in real time.
DIDACT is a multi-task model trained on development activities including editing, debugging, fixing and code review
The researchers built and deployed three DIDACT tools in-house, Annotation Parsing, Build Repair, and Tip Prediction, each integrated at different stages of the development workflow.
For decades, Google’s software engineering tool chain stored every operation related to code as a tool and development Logs of interactions between people.
In principle, users can use these records to replay in detail the key change process in the software development process, that is, how Google's code base was formed, including every code edit, Compilation, annotation, variable renaming, etc.
Google's development team will store the code in a monorepo (mono repository), which is a code repository that contains all tools and systems.
Software developers typically make code modifications in a local copy-on-write workspace managed by Clients in the Cloud (CitC) systems. experiment.
When a developer is ready to package a set of code changes together to achieve a certain task (such as fixing a bug), he or she needs to create a code change in Critique, Google's code review system. Changelist (CL).
Like common code review systems, developers communicate with peer reviewers about functionality and style, and then edit the CL to address issues raised during review comments.
Eventually, the reviewer declared the code "LGTM!" and merged the CL into the code base.
Of course, in addition to conversations with code reviewers, developers also need to maintain a large number of "dialogues" with other software engineering tools, including compilers, test frameworks, linkers, Static analyzers, fuzz testing tools, etc.
An illustration of the complex network of activities involved in software development: the activities of developers, interactions with code reviewers, and the use of tools such as compilers transfer.
DIDACT leverages the interaction between engineers and tools to empower machine learning models by suggesting or optimizing developers’ execution of software actions during engineering tasks to assist Google developers in participating in the software engineering process.
To this end, the researchers defined a number of tasks regarding individual developer activities: fixing broken builds, predicting code review comments, processing code review comments, renaming variables, editing files, etc. .
Then define a common form for each activity: get a certain State (code file), an Intent (annotation specific to an activity, such as code review annotation or compilation processor error) and generate an Action (an operation for processing the task).
Action is like a mini programming language that can be expanded into newly added activities, covering editing, adding comments, renaming variables, marking code errors, etc. It can also be called this The first language is DevScript.
The input prompts of the DIDACT model are tasks, code snippets and comments related to the task, and the output is development actions, such as editing or comments
Status- The definition form of Intent-Action (State-Intent-Action) can capture different tasks in a common way. More importantly, DevScript can express complex actions concisely without the need to output the entire state after the action occurs ( original code), making the model more efficient and interpretable.
For example, renaming may modify multiple places in the code file, but the model only needs to predict one renaming operation.
DIDACT runs very well on personal auxiliary tasks. For example, the following example demonstrates the code of DIDACT after the function is completed. For cleanup work, first enter the code reviewer's final comments (marked human in the picture), and then predict the operations required to solve the problems raised in the comments (shown with diff).
Given an initial snippet of code and the comments the code reviewer attached to the snippet, DIDACT's Pre-Submit Cleanup task generates a Editing operations (insertion and deletion of text)
The multi-modal nature of DIDACT also gives rise to some completely new behaviors that emerge with scale, one of which is history enhancement ( history augmentation), this capability can be enabled via prompts. Knowing what the developer has done recently allows the model to better predict what the developer should do next.
##Demonstration of historical enhanced code completion
The history enhanced code completion task can demonstrate this ability. In the example above, the developer added a new function parameter (1) and moved the cursor into the document (2). Based on the developer's editing history and cursor position, the model is able to accurately predict the docstring entry for the new parameter and complete the third step.
In the more difficult task of history-augmented edit prediction, the model is able to select the location of the next edit in a historically consistent manner.
Demonstration of edit prediction over multiple chained iterations
If a developer removes a function parameter (1), the model can correctly predict an update to the docstring (2) that removes the parameter based on history (without requiring a human developer to manually place the cursor there), and in the syntax correctly (and arguably semantically) update the statement in function (3).
With the history, the model can clearly decide how to correctly continue the "editing code process", but without the history, the model has no way of knowing that the missing function parameters were intentional ( Because the developer was doing a longer editing operation to remove the parameter) or was it an unexpected situation (the model should re-add the parameter to fix the problem).
In addition, the model can also complete more tasks, such as starting from a blank file and requiring the model to continuously predict the next editing operations until a complete code is written. document.
Most importantly, the model assists in writing code in a step-by-step manner that is natural to developers:
Start by creating a complete working framework with imports, flags, and a basic main function; then gradually add new functionality, such as reading and writing results from files, and adding filtering of certain lines based on user-supplied regular expressions Function.
ConclusionDIDACT transforms Google’s software development process into training demos for machine learning developer assistants and uses these demo data to train models in a step-by-step manner Build code, interact with tools and code reviewers.
The DIDACT approach complements the great achievements of large-scale language models from Google and others to reduce workload, increase productivity, and improve the quality of software engineers' work.
The above is the detailed content of Google discloses its own 'AI+ software engineering' framework DIDACT: Thousands of developers have tested it internally, and they all say it is highly productive after using it. For more information, please follow other related articles on the PHP Chinese website!