Normal Forms in Context-Free Grammars (CFGs)

Normal Forms in Context-Free Grammars (CFGs) are specific standardized formats for representing production rules. Converting a CFG into a normal form helps in simplifying parsing algorithms, proving theoretical properties, and implementing computational models. The two most common normal forms are Chomsky Normal Form (CNF) and Greibach Normal Form (GNF).


Chomsky Normal Form (CNF)

Chomsky Normal Form (CNF) is a simplified form of CFG where each production rule adheres to one of the following formats:

  • A → BC: Where A, B, and C are non-terminal symbols, and B and C are not the start symbol.
  • A → a: Where A is a non-terminal, and a is a terminal symbol.
  • S → ε: Allowed only if ε is part of the language and S is the start symbol.

CNF is particularly useful for algorithms like the CYK (Cocke–Younger–Kasami) parsing algorithm.

Steps to Convert a CFG to CNF

  1. Eliminate ε-Productions: Remove all productions of the form A → ε, except when ε is in the language.
  2. Eliminate Unit Productions: Remove productions of the form A → B, where A and B are non-terminals.
  3. Convert to Binary Productions: Replace right-hand sides with more than two non-terminals (e.g., A → BCD) with binary productions (e.g., A → BX, X → CD).
  4. Replace Terminals in Mixed Productions: For rules like A → aB, introduce a new non-terminal (e.g., C → a) and rewrite the rule as A → CB.

Example: Convert the grammar:

</>
Copy
S → AB | a
A → BC | ε
B → b
C → c

After applying the steps:

</>
Copy
S → AB | a
A → BC
B → b
C → c

Greibach Normal Form (GNF)

Greibach Normal Form (GNF) is another standardized form of CFG where every production rule adheres to the format:

  • A → aα: Where A is a non-terminal, a is a terminal, and α is a string of non-terminals (or empty).

GNF is particularly useful for designing top-down parsers, as it ensures that derivations always start with a terminal symbol.

Steps to Convert a CFG to GNF

  1. Eliminate Left Recursion: Remove productions where a non-terminal derives itself in the leftmost position (e.g., A → Aa | b).
  2. Ensure Proper Order: Rewrite the grammar so that the right-hand side of each rule starts with a terminal or empty string.
  3. Substitute Non-Terminals: Replace non-terminals on the right-hand side with their equivalent productions until all rules conform to GNF.

Example: Convert the grammar:

</>
Copy
S → AB | a
A → aA | b
B → bB | ε

After converting to GNF:

</>
Copy
S → aA | a
A → aA | b
B → bB | b

Comparison of CNF and GNF

Both CNF and GNF standardize CFGs but serve different purposes:

AspectChomsky Normal Form (CNF)Greibach Normal Form (GNF)
Production RulesBinary productions (A → BC) and terminal productions (A → a).Every production starts with a terminal (A → aα).
ParsingUseful for CYK algorithm (bottom-up parsing).Useful for top-down parsers.
ComplexitySimpler for computational purposes.More restrictive but aids in leftmost derivations.

Key Takeaways

  • Normal forms like CNF and GNF standardize CFGs for theoretical and practical purposes.
  • CNF is best suited for bottom-up parsing algorithms like CYK.
  • GNF ensures productions begin with a terminal, aiding in top-down parsing.
  • Converting CFGs to normal forms involves removing ε-productions, unit productions, and left recursion.
  • Understanding normal forms is essential for automata theory, parsing, and compiler design.