Grammar-Based Fuzzing with Jest: Testing a Simple Parser
Grammar-based fuzzing is a powerful technique for generating test inputs based on a defined grammar. This challenge asks you to implement a basic grammar-based fuzzer within a Jest testing environment to test a simple expression parser. The goal is to generate diverse and potentially invalid expressions to uncover vulnerabilities or unexpected behavior in the parser.
Problem Description
You are tasked with creating a Jest test suite that utilizes grammar-based fuzzing to test a provided expression parser function. The parser, parseExpression, takes a string representing a mathematical expression and attempts to evaluate it. The grammar defines the valid structure of these expressions, and your fuzzer will generate strings conforming to this grammar, as well as slightly malformed strings to test error handling.
What needs to be achieved:
- Define a Grammar: You'll be provided with a simplified grammar for mathematical expressions. This grammar will specify the allowed operators (+, -, *, /), numbers, and parentheses.
- Implement a Fuzzer: Create a function
generateExpression(grammar: Grammar, max_length: number): stringthat generates strings based on the provided grammar. Themax_lengthparameter limits the length of the generated expression to prevent excessively long inputs. - Create a Jest Test Suite: Write a Jest test suite that uses the fuzzer to generate a set of expressions. For each generated expression, the test suite should:
- Call the
parseExpressionfunction with the generated expression. - Assert that the function either returns a valid result (if the expression is valid) or throws an expected error (if the expression is invalid).
- Call the
- Error Handling: The
parseExpressionfunction is expected to throw an error if the input expression is syntactically invalid. Your tests should specifically check for these errors.
Key Requirements:
- The fuzzer must generate strings that adhere to the provided grammar.
- The test suite must generate a reasonable number of expressions (e.g., 100) to provide good coverage.
- The tests must handle both valid and invalid expressions correctly.
- The code must be well-structured and readable.
Expected Behavior:
The test suite should pass if the parseExpression function correctly parses valid expressions and throws appropriate errors for invalid expressions, as determined by the grammar. The fuzzer should generate a diverse set of expressions, including edge cases and slightly malformed inputs.
Edge Cases to Consider:
- Empty expressions.
- Expressions with only numbers.
- Expressions with only operators.
- Expressions with unbalanced parentheses.
- Expressions with division by zero.
- Expressions with invalid characters.
- Very long expressions (to test performance and potential stack overflow).
Examples
Example 1:
Input: Grammar: { start: 'expression' }, expression: ['term', 'expression_tail'] , expression_tail: ['+', 'term'], ['-', 'term'], '' , term: ['factor', 'term_tail'], term_tail: ['*', 'factor'], ['/', 'factor'], '', factor: ['number', 'parenthesized_expression'], parenthesized_expression: ['(', 'expression', ')'], number: [/[0-9]+/], max_length: 10
Output: "1+2*3"
Explanation: The fuzzer generates a valid expression conforming to the grammar.
Example 2:
Input: Grammar: { start: 'expression' }, expression: ['term', 'expression_tail'] , expression_tail: ['+', 'term'], ['-', 'term'], '' , term: ['factor', 'term_tail'], term_tail: ['*', 'factor'], ['/', 'factor'], '', factor: ['number', 'parenthesized_expression'], parenthesized_expression: ['(', 'expression', ')'], number: [/[0-9]+/], max_length: 5
Output: "1+"
Explanation: The fuzzer generates a valid, albeit incomplete, expression.
Example 3:
Input: Grammar: { start: 'expression' }, expression: ['term', 'expression_tail'] , expression_tail: ['+', 'term'], ['-', 'term'], '' , term: ['factor', 'term_tail'], term_tail: ['*', 'factor'], ['/', 'factor'], '', factor: ['number', 'parenthesized_expression'], parenthesized_expression: ['(', 'expression', ')'], number: [/[0-9]+/], max_length: 8
Output: "(1+2"
Explanation: The fuzzer generates an invalid expression due to unbalanced parentheses. The test should assert that `parseExpression` throws an error.
Constraints
- Grammar Format: The grammar will be provided as a JavaScript object with rules defined as strings and arrays of production rules. See the examples for the expected format.
max_length: Themax_lengthparameter for the fuzzer should be between 5 and 20 (inclusive).- Number of Tests: The Jest test suite should generate at least 50 and no more than 200 expressions.
parseExpressionFunction: You will be provided with theparseExpressionfunction. You are not allowed to modify it.- Error Type: The
parseExpressionfunction is expected to throw anErrorobject when an invalid expression is encountered.
Notes
- Consider using a recursive descent approach for the fuzzer to easily traverse the grammar.
- Introduce randomness into the fuzzer to generate diverse expressions.
- Think about how to balance the generation of valid and invalid expressions. A good mix will provide more comprehensive testing.
- The provided
parseExpressionfunction is intentionally simplified. Focus on the fuzzing and testing aspects of the problem. - You'll need to install Jest:
npm install --save-dev jest @types/jest - You'll need to create a
jest.config.jsfile (can be minimal). - The
parseExpressionfunction will be provided separately. Assume it exists and is accessible in your test file.
// Example parseExpression function (provided separately)
// function parseExpression(expression: string): number {
// // Implementation details omitted for brevity
// throw new Error("Not implemented");
// }