Crio Projects - Python interpreter | Crio.Do | Project-Based Learning Platform for Developers

Objective

Create a mini interpreter by implementing the functionality that parses variable assignments and evaluates expressions.


Project Context

An interpreter is a program that translates a high-level language into a low-level one when the program is run. You can write the program using a text editor or something similar and then instruct the interpreter to run the program. Interpreter takes the program, one line at a time and translates each line before running it: It translates the first line and runs it, then translates the second line and runs it etc.


But, what if we were to create our own interpreter? On taking up such a task, one will have to look behind the curtains and understand the various working principles an interpreter depends on. We will be talking about implementation of the same in the milestones of this project.


[Note: You are free to use any programming language of your choice.]


Project Stages

The project consists of the following stages:

image alt text

High-Level Approach

  • Aim to build a simple CLI(Command Line Interface) application.

  • Implement the functionality of parsing and validating a simple numerical assignment.

  • Validate whether the variable name is following standard conventions.

  • Implement addition, subtraction, multiplication, division and exponentiation using two operands.

  • Implement expression evaluation, keeping operator precedence in mind.

  • Add validation for expressions as well.

Pre-requisite skills

  • Python (or any programming language of choice)

Post Project Skills

  • String parsing

  • Infix expression evaluation

  • Lookup tables

Objective

Create a mini interpreter by implementing the functionality that parses variable assignments and evaluates expressions.


Project Context

An interpreter is a program that translates a high-level language into a low-level one when the program is run. You can write the program using a text editor or something similar and then instruct the interpreter to run the program. Interpreter takes the program, one line at a time and translates each line before running it: It translates the first line and runs it, then translates the second line and runs it etc.


But, what if we were to create our own interpreter? On taking up such a task, one will have to look behind the curtains and understand the various working principles an interpreter depends on. We will be talking about implementation of the same in the milestones of this project.


[Note: You are free to use any programming language of your choice.]


Project Stages

The project consists of the following stages:

image alt text

High-Level Approach

  • Aim to build a simple CLI(Command Line Interface) application.

  • Implement the functionality of parsing and validating a simple numerical assignment.

  • Validate whether the variable name is following standard conventions.

  • Implement addition, subtraction, multiplication, division and exponentiation using two operands.

  • Implement expression evaluation, keeping operator precedence in mind.

  • Add validation for expressions as well.

Pre-requisite skills

  • Python (or any programming language of choice)

Post Project Skills

  • String parsing

  • Infix expression evaluation

  • Lookup tables

Parse and validate simple numerical assignments

Parse strings that represent variable assignment statements and initialize the values in the respective variables. Also, add validation for the assignments.


Requirements

  • Parse numerical assignment statements and store values accordingly. Examples of such are:

    • a = 4

    • b = 3.3;

  • Choose an optimal data structure for storing the values.

  • Implement your own version of string tokenizer and use it in this project.

  • Add the functionality to parse exponents as assignments. Examples of such are:

    • x = 8.8e5

    • y = 2.7E10

  • Validate whether the literals on the right hand side of = are numerical.

    • Valid numeric literals are: 0.1e10, -01.1e-10, 0xFF(hexadecimal representation), 0xaF(hexadecimal representation), 0.34, 00, 0088

    • Invalid numeric literals are: 1u, .3, 3e0.1(exponents cannot have decimal numbers), 1f, 1000LL, 100L

  • Add validation for variable initialization using another variable.

    • For example, if we have a = 5 followed by b = a, the two statements are valid since 5 is a numeric value and a = 5 implies that b = 5. Also, a was encountered before b. The order plays a crucial role here.

    • If it would have been b = a followed by a = 5, then the first statement would be invalid, since we didn’t encounter a yet, and hence we don’t have it’s value.

  • The input can be taken from the CLI. The program needs to process the input and output the resultant value of all variables, in case of correct inputs, and will output ERROR details in case of invalid statements.


Expected Outcome

On completion of this milestone, we’ll have a basic interpreter in place, which can interpret variable assignments of numeric data. By interpret, we mean that your program will parse strings, validate the values and store them in respective variables.


Validate variable naming conventions

Every programming language has a set of variable conventions. These conventions help in maintaining uniform standards across multiple programs written by programmers.


Requirements

  • Add validation for proper variable naming conventions. Many programming languages have similar rules for naming their variables. For example, in C++ some of the valid identifiers are shyam, _max, j_47, name10, etc. and some invalid identifiers are: 4xyz, x-ray, abc 2, etc.

  • Your program should print ERROR messages on encountering invalid identifiers.


Bring it On!

  • In case of invalid identifiers:

    • Can you add the feature to your program to inform details of why an identifier is invalid?

    • Can you add a feature such that your program provides suggestions for identifier naming?


Expected Outcome

Your program should be able to differentiate between valid and invalid variable names.


Evaluate expressions and validate them

Add the functionality to evaluate arithmetic expressions and validate the same.


Requirements

  • Implement arithmetic operations such as addition(+), subtraction(-), multiplication(*), division(/) and exponentiation(**) using two operands. Your Interpreter should be able to evaluate the below expressions:

    • 5 + 3

    • 4.5 / 3

    • a + b

    • 25**0.5

  • Evaluate infix expressions.

  • Operator precedence should be handled.

    • a - b * c should be evaluated as a - (b * c)
  • Check for parenthesis match.

    • Valid : (a*(b+c))

    • Invalid: (a*(b+c)

  • Validate whether all of the tokens in the expression are valid.

    • 5 - 3 + 4 is valid

    • 6 + y - 4 would be invalid if we’ve not encountered y before

    • 5 * 5c is invalid as 5c is invalid

  • In case of arithmetic expressions as variable assignments, validate whether the right hand side of a variable assignment statement is a valid expression or not. For example:

    • a = b + c / d is valid if the interpreter has already encountered b, c and d

    • Similarly the following expression should work if the variables of the expression have already been encountered

      • res = a * b + c ** d + d-c + c/d * (a - b)
    • e = f * g will be invalid if either f or g weren’t encountered before this statement

    • a = 5c / d is invalid, as 5c is an invalid token

    • Suppose we have g = s + d - e / 23e2.5 and we’ve already encountered s, d and e, it’s still invalid because 23e2.5 is invalid, since exponents are always given as whole numbers

  • A proper ERROR message should be displayed on encountering invalid expressions.


Bring it On!

  • Can you answer the following question:

    • What is the time complexity of the function which adds/subtracts one digit at a time?
  • Implement all the above arithmetic operations for arbitrary precision (say 1000 digits precision).


Expected Outcome

By the end of this milestone, your program should be able to do the following:

  • Evaluate arithmetic expressions

  • Validate arithmetic expressions


Improving the efficiency of the interpreter

Efficiency plays a vital role in software usability. On completion of the earlier milestones, we have an interpreter in place with basic functionality. But, it can be made more efficient by making adjustments to some micro components such as better variable accessibility and efficient calculations.


Note: This Milestone is not mandatory, but it improves the performance of your software.


Requirements

  • The key value pair of assignment statements form perfect lookup table entries. After validating expressions enter them in the lookup table(use a hashing based data structure). The idea is that you should be able to look up values based on variable names.

  • For better complexity of multiplication, use the Karatsuba algorithm.

  • For better complexity of division, use the Newton–Raphson division.

  • Find a way to perform fast exponentiation.


Expected Outcome

By the end of this milestone, you should have an efficient interpreter in place with basic functionalities.


Publish to GitHub

Great job!


Now that you’ve successfully completed all the milestones, you can show off your work to your peers! Go on and publish your new piece of software on GitHub and get some nice green pixels!


[Note: Kindly go through this Byte if you’re unfamiliar with Git.]