I recently added PowerShell to C# conversion to the CodeConverter.NET SDK. The process was very similar to converting C# to PowerShell: the implementation walks the PowerShell AST and produces a common AST, which is then fed into the C# code writer to output C# code.
The limitation of straight AST translation
AST translation works fine for constructs that exist in both languages. Converting for loops, if statements and method invocations is trivial between .NET languages, or even between modern C-style languages. The snag comes when translating language elements that do not exist in the target language.
In PowerShell to C# conversion, those snags are cmdlets. PowerShell cmdlets encompass a large set of functionality that doesn’t map onto a simple node-for-node AST conversion to C#. The source AST visitor would need to know about the semantics of the target language to create an AST that correctly describes the intent of the code segment. That architecture would not scale, so it is desirable to abstract the intent from the implementation.
The CodeConverter.NET SDK now has the concept of intents. AST nodes describe the physical structure of a source document. An intent, on the other hand, describes the intended outcome of a code segment. AST nodes can now be augmented with an intent that describes what the node is attempting to do. Some examples of intents are writing to a file, outputting to the console or getting a list of processes.
To reproduce the same logic in a different language and structure, the syntax visitor needs the intelligence to recognize an intent, analyze the necessary nodes and attach an intent description to the accompanying AST node. The code writer needs to know when to look for an intent on a given type of AST node and output the appropriate code, even when that code is completely different from the AST it was provided.
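A rough sketch of how the visitor and writer could cooperate, written in Python for brevity. Every name here (Intent, Invocation, visit_command, write_csharp) is hypothetical and chosen for illustration; none of it is the SDK's actual API. The visitor attaches an intent when it recognizes one, and the writer prefers the intent over the node's structure.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Intent:
    """What a code segment is trying to achieve (hypothetical type)."""
    name: str
    arguments: dict = field(default_factory=dict)


@dataclass
class Invocation:
    """A common-AST node describing the physical structure of a call."""
    target: str
    arguments: List[str]
    intent: Optional[Intent] = None


def visit_command(name: str, args: List[str]) -> Invocation:
    """Source-side visitor: build the node and pin an intent if one is recognized."""
    node = Invocation(target=name, arguments=args)
    if name.lower() == "write-host" and args:
        node.intent = Intent("WriteConsole", {"message": args[0]})
    return node


def write_csharp(node: Invocation) -> str:
    """Target-side writer: use the intent when present, else translate structurally."""
    if node.intent is not None and node.intent.name == "WriteConsole":
        return f'System.Console.WriteLine({node.intent.arguments["message"]});'
    return f'{node.target}({", ".join(node.arguments)});'


print(write_csharp(visit_command("Write-Host", ['"hello"'])))
print(write_csharp(visit_command("DoWork", ["1", "2"])))
```

The point of the split is that visit_command never mentions C# and write_csharp never mentions Write-Host; the intent name is the only thing the two sides share.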
Determining the intent of a PowerShell cmdlet is easy: you can usually just read the name of the cmdlet and understand what the user is attempting to do. The PowerShell AST visitor can produce intents by inspecting cmdlet usage and pinning the intent to the AST node when it passes it to the code writers. An example of this is translating Start-Process to Process.Start in C#. The ASTs for these two snippets are vastly different: the PowerShell side is a command node with a name and parameters, while the C# side is a method invocation on a member of the Process type.
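To make the Start-Process case concrete, here is a minimal sketch in Python that looks up an intent by cmdlet name and emits the matching C# call. The Command type, the CMDLET_INTENTS table and the intent names are assumptions made up for this example, not part of the SDK. It also simplifies -ArgumentList to a single string, whereas PowerShell accepts an array there.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical lookup table: cmdlet name (lowercased) -> intent name.
CMDLET_INTENTS = {
    "start-process": "StartProcess",
    "get-process": "GetProcesses",
}


@dataclass
class Command:
    """Simplified stand-in for a PowerShell command node."""
    name: str
    parameters: dict  # parameter name -> value as already-quoted source text


def intent_for(command: Command) -> Optional[str]:
    """Cmdlet names are case-insensitive, so normalize before the lookup."""
    return CMDLET_INTENTS.get(command.name.lower())


def to_csharp(command: Command) -> str:
    """Emit C# for the command's intent rather than for its syntax."""
    intent = intent_for(command)
    if intent == "StartProcess":
        call_args = [command.parameters["FilePath"]]
        if "ArgumentList" in command.parameters:
            call_args.append(command.parameters["ArgumentList"])
        return f'System.Diagnostics.Process.Start({", ".join(call_args)});'
    if intent == "GetProcesses":
        return "System.Diagnostics.Process.GetProcesses();"
    raise NotImplementedError(f"No intent known for {command.name}")


print(to_csharp(Command("Start-Process", {"FilePath": '"notepad.exe"'})))
```

Nothing in the emitted line resembles the shape of the PowerShell command; only the intent carries over.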
The set of possible code intents is endless and it would be impossible to understand every single one. It’s also much harder to determine the intent of Java or C# code because it doesn’t come down to a single AST node. A more intelligent AST walking algorithm is necessary. That really complicates the implementation and seems like a good use of machine learning or an entire cube farm full of programmers.
For PowerShell to C# conversions, the cmdlet list is large but not endless. It’s possible to automate the walking of the AST and generate the code to do so. From there, implementing the code writing is an exercise in determination but very possible. My goal is to target the high value cmdlets first. Where-Object, ForEach-Object and Get-Process are all on my list. Feel free to open issues on GitHub.
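One way to grow cmdlet coverage incrementally is a handler registry, sketched below in Python. The handles decorator, the HANDLERS table and the unsupported helper are all hypothetical names for illustration and say nothing about how the SDK is organized. Each supported cmdlet registers a writer, and anything not yet registered can be reported so it is clear what to implement next.

```python
from typing import Callable, Dict, Iterable, List

# Hypothetical registry: cmdlet name (lowercased) -> function that emits C#.
HANDLERS: Dict[str, Callable[[List[str]], str]] = {}


def handles(cmdlet: str):
    """Decorator that registers a C# writer for one cmdlet."""
    def register(func: Callable[[List[str]], str]):
        HANDLERS[cmdlet.lower()] = func
        return func
    return register


@handles("Get-Process")
def write_get_process(args: List[str]) -> str:
    # Ignores any arguments; a real writer would have to handle filters.
    return "System.Diagnostics.Process.GetProcesses();"


def unsupported(cmdlets: Iterable[str]) -> List[str]:
    """Return the cmdlets in a script that have no registered writer yet."""
    return sorted({c for c in cmdlets if c.lower() not in HANDLERS})


print(unsupported(["Get-Process", "Where-Object", "ForEach-Object"]))
```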
If you want to give the code conversion a shot, the latest SDK is now running on CodeConverter.NET.