ChatGPT for language development

Played around with ChatGPT, asking it to help me design a DSL. Once I was happy with the language design, I asked it to write a parser and was surprised by the answer. Maybe it was just my line of questioning, and there are ways of getting it to help write a parser. I just found its answer curious and worth sharing here. Has anyone here successfully had it write a parser?

It would be interesting to see the rest of that convo as well!

Can’t really share the actual convo, ’cause it was for work, but it was nothing special, really.
I just described the requirements for a DSL and the target group I have in mind in two paragraphs and asked it to create the language.
The thing first suggested a very JS-like syntax, printing an example program. I asked if it could change it to be more natural-language-like, and it successfully transformed the example into “English-like” sentences (way too verbose). That’s when I asked it to write a parser (which is what you can see in my original post).

I continued by asking it to write me a grammar file in ANTLR4, and it basically did that, but only for the concrete example it had given earlier, i.e. all the terminal rules were just constants :wink:
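
To illustrate: instead of general lexer rules, the grammar hard-coded the literals from the example program. A made-up ANTLR4 fragment in the same spirit (not the actual output):

grammar Example;
program   : statement+ ;
statement : GREET NAME ;          // only ever parses the one example program
GREET     : 'say hello to' ;      // a constant where a general phrase rule should be
NAME      : 'Alice' ;             // a constant instead of an identifier rule
WS        : [ \t\r\n]+ -> skip ;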


Yes, I’ve been able to get ChatGPT to help with writing various components of a compiler, including parsers. I’m currently having it help me with a complex bidirectional type checker, and it’s been an illuminating experience.

What I’ve found works is describing the kind of task you’re looking for in detail, then starting small and building up to larger features incrementally.

So, for my current task, I don’t just say “Write me a type checker for a language with x, y, and z features.” I start by describing the language features and type system, at each step checking in with something like “does that make sense?” to get confirmation from the bot (without it then launching into a massive soliloquy of observations based on what I’ve just given it, which I’ve found it prone to doing if I don’t home in on the kind of response I’m looking for).

Then I have it help me define the types for AST nodes (I’m using TypeScript for various reasons, but there’s no reason you couldn’t use another language). Then I ask it some general questions about bidirectional type checking and have it come up with an overall algorithm for inferring and checking types. Then, starting from the AST node and Type definitions, I have it write the individual inference and checking functions one at a time.
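
To give a sense of the shape of those definitions, here’s an illustrative sketch (not my actual code, which covers many more node kinds and type constructors):

type Expression =
  | { type: "NumericLiteral"; value: number }
  | { type: "StringLiteral"; value: string }
  | { type: "Identifier"; name: string }
  | { type: "CallExpression"; callee: Expression; args: Expression[] };

type Type =
  | { kind: "Number" }
  | { kind: "String" }
  | { kind: "Function"; params: Type[]; returns: Type };

// Maps identifiers in scope to their types.
type TypeEnvironment = Map<string, Type>;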

So, for example, after doing all the work with AST nodes and defining types for Types, I had it write this overall inference function:

function inferExpressionType(expr: Expression, typeEnv: TypeEnvironment, { constant = false, inTypePosition = false } = {}): Type {
  switch (expr.type) {
    case "NumericLiteral":
      return inferNumericLiteralType(constant);
    case "StringLiteral":
      return inferStringLiteralType(constant);
    case "BooleanLiteral":
      return inferBooleanLiteralType(constant);
    case "NilLiteral":
      return inferNilLiteralType(constant);
    case "Identifier":
      return inferIdentifierType(expr, typeEnv, { constant, inTypePosition });
    case "CallExpression":
      return inferCallExpressionType(expr, typeEnv, constant);
    case "ListLiteral":
      return inferListLiteralType(expr, typeEnv, constant);
    case "VectorLiteral":
      return inferVectorLiteralType(expr, typeEnv, constant);
    case "SetLiteral":
      return inferSetLiteralType(expr, typeEnv, constant);
    case "DictLiteral":
      return inferDictLiteralType(expr, typeEnv, constant);
	case "TupleLiteral":
	  return inferTupleLiteralType(expr, typeEnv, constant);
	case "ObjectLiteral":
	  return inferObjectLiteralType(expr, typeEnv, constant);
	case "MemberExpression":
	  return inferMemberExpressionType(expr, typeEnv, constant);
	case "SliceExpression":
	  return inferSliceExpressionType(expr, typeEnv, constant);
    case "BinaryExpression":
      return inferBinaryExpressionType(expr, typeEnv, constant);
    case "UnaryExpression":
      return inferUnaryExpressionType(expr, typeEnv, constant);
	case "UpdateExpression":
	  return inferUpdateExpressionType(expr, typeEnv, constant);
    case "AssignmentExpression":
      return inferAssignmentExpressionType(expr, typeEnv, constant);
    case "IfExpression":
      return inferIfExpressionType(expr, typeEnv, constant);
    case "CondExpression":
      return inferCondExpressionType(expr, typeEnv, constant);
	case "LambdaExpression":
      return inferLambdaExpressionType(expr, typeEnv, constant);
    case "LetExpression":
      return inferLetExpressionType(expr, typeEnv, constant);
    case "MatchExpression":
      return inferMatchExpressionType(expr, typeEnv, constant);
    default:
      throw new Error(`Unhandled expression type: ${(expr as Expression).type}`); // use diagnostic with source location instead of throwing error
  }
}

Then we continued by spelling out each individual inference function, one at a time. Some of them get pretty complex (not least because we’re doing generics and union/intersection types), so it’s a back-and-forth process of me explaining what I want and then offering my own observations (and, occasionally, corrections) as needed to get the output I’m looking for.
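
For example, one of the simpler ones, the identifier lookup, came out along these lines (again an illustrative sketch using the types from above, not our exact code):

function inferIdentifierType(
  expr: { type: "Identifier"; name: string },
  typeEnv: TypeEnvironment,
  { constant = false, inTypePosition = false } = {}
): Type {
  // Look the identifier up in the enclosing scope's environment.
  const ty = typeEnv.get(expr.name);
  if (ty === undefined) {
    // The real version emits a diagnostic with a source location instead.
    throw new Error(`Unbound identifier: ${expr.name}`);
  }
  // In the real version, `constant` and `inTypePosition` also influence
  // how the looked-up type is narrowed; omitted here for brevity.
  return ty;
}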

It’s a lot more involved than just saying “Write me a parser for x, y, and z language features,” but it seems to work.

I also got a good response when I asked it “Show me an example of Vaughan Pratt’s top down operator precedence parsing algorithm written in F#,” as well as “Can you show me how to use Vaughan Pratt’s top down operator precedence algorithm to parse OCaml expressions, function declarations, and function calls? In TypeScript this time.”
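
For anyone curious, here’s a minimal sketch of the technique in TypeScript (my own illustration, not ChatGPT’s actual output; the token and node shapes are invented for the example):

type Token =
  | { kind: "num"; value: number }
  | { kind: "op"; op: string }
  | { kind: "eof" };

type Expr =
  | { type: "NumericLiteral"; value: number }
  | { type: "UnaryExpression"; op: string; operand: Expr }
  | { type: "BinaryExpression"; op: string; left: Expr; right: Expr };

// Binding power per operator; a higher number binds more tightly.
const bindingPower: Record<string, number | undefined> = { "+": 10, "-": 10, "*": 20, "/": 20 };

class Parser {
  private pos = 0;
  // Assumes the token stream is terminated by a { kind: "eof" } token.
  constructor(private tokens: Token[]) {}

  private peek(): Token { return this.tokens[this.pos]; }
  private advance(): Token { return this.tokens[this.pos++]; }

  // Parse an expression, consuming only operators that bind more tightly than minBp.
  parseExpression(minBp = 0): Expr {
    let lhs = this.parsePrefix();
    while (true) {
      const tok = this.peek();
      if (tok.kind !== "op") break;
      const bp = bindingPower[tok.op];
      if (bp === undefined || bp <= minBp) break;
      this.advance();
      // Recursing with bp (not bp - 1) makes binary operators left-associative.
      const rhs = this.parseExpression(bp);
      lhs = { type: "BinaryExpression", op: tok.op, left: lhs, right: rhs };
    }
    return lhs;
  }

  private parsePrefix(): Expr {
    const tok = this.advance();
    if (tok.kind === "num") return { type: "NumericLiteral", value: tok.value };
    if (tok.kind === "op" && tok.op === "-") {
      // Unary minus binds more tightly than any binary operator above.
      return { type: "UnaryExpression", op: "-", operand: this.parseExpression(30) };
    }
    throw new Error("Unexpected token");
  }
}

With this, 1 + 2 * 3 parses as 1 + (2 * 3), and 1 - 2 - 3 as (1 - 2) - 3, which is exactly the behavior the binding powers are there to encode.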

It needs a lot of guidance, but I’ve still found it useful.


I’m not sure that working with ChatGPT has saved me a ton of time, but it’s been an enjoyable experience so far. It has also caught cases I probably would have missed on my own, so it has likely saved me some debugging time.

Oh yes, and every so often during the chat you’ll need to remind ChatGPT of something you’ve already discussed with it, and possibly already defined in the code, because its memory only covers something like 3,000 words of the conversation. So I’ve had to re-enter a lot of code snippets more than once.

Thanks for sharing that, Jason.
At least it sounds like there is an increased chance of confirmation bias:

[…] if I don’t home in on the kind of response I’m looking for

I’ve had similar experiences (not specifically when having it help me write code, but in general): the thing tries to be very affirmative, even when what I say is clearly wrong. That scares me. And when it comes to writing code, while impressive, I definitely did not get the feeling that I became more productive thanks to ChatGPT. That might just be my very subjective experience, and I realize it might be vastly different for others.


I’ve found that the more precise I can be in telling it what I want, which sometimes includes giving it portions of a library’s documentation when doing research on that library, the better (and more efficient) the results. Depending on what you’re trying to do with it, that extra effort may not be worth it. It has been for me, but that’s just me.

So far, ChatGPT looks to be (at least to me) a bunch of dice that make you win the game impressively often…provided you know which game you’re playing…

(As in: you have to know how to validate the answers’ correctness, and have an idea of how to improve them. That implies already having plenty of domain knowledge.)

Can I steal that? :joy:

Stealing including attribution: :vulcan_salute:
