Normal Forms in Context-Free Grammars (CFGs)
Normal Forms in Context-Free Grammars (CFGs) are specific standardized formats for representing production rules. Converting a CFG into a normal form helps in simplifying parsing algorithms, proving theoretical properties, and implementing computational models. The two most common normal forms are Chomsky Normal Form (CNF) and Greibach Normal Form (GNF).
Chomsky Normal Form (CNF)
Chomsky Normal Form (CNF) is a simplified form of CFG where each production rule adheres to one of the following formats:
A → BC
: WhereA
,B
, andC
are non-terminal symbols, andB
andC
are not the start symbol.A → a
: WhereA
is a non-terminal, anda
is a terminal symbol.S → ε
: Allowed only ifε
is part of the language andS
is the start symbol.
CNF is particularly useful for algorithms like the CYK (Cocke–Younger–Kasami) parsing algorithm.
Steps to Convert a CFG to CNF
- Eliminate ε-Productions: Remove all productions of the form
A → ε
, except whenε
is in the language. - Eliminate Unit Productions: Remove productions of the form
A → B
, whereA
andB
are non-terminals. - Convert to Binary Productions: Replace right-hand sides with more than two non-terminals (e.g.,
A → BCD
) with binary productions (e.g.,A → BX, X → CD
). - Replace Terminals in Mixed Productions: For rules like
A → aB
, introduce a new non-terminal (e.g.,C → a
) and rewrite the rule asA → CB
.
Example: Convert the grammar:
S → AB | a
A → BC | ε
B → b
C → c
After applying the steps:
S → AB | a
A → BC
B → b
C → c
Greibach Normal Form (GNF)
Greibach Normal Form (GNF) is another standardized form of CFG where every production rule adheres to the format:
A → aα
: WhereA
is a non-terminal,a
is a terminal, andα
is a string of non-terminals (or empty).
GNF is particularly useful for designing top-down parsers, as it ensures that derivations always start with a terminal symbol.
Steps to Convert a CFG to GNF
- Eliminate Left Recursion: Remove productions where a non-terminal derives itself in the leftmost position (e.g.,
A → Aa | b
). - Ensure Proper Order: Rewrite the grammar so that the right-hand side of each rule starts with a terminal or empty string.
- Substitute Non-Terminals: Replace non-terminals on the right-hand side with their equivalent productions until all rules conform to GNF.
Example: Convert the grammar:
S → AB | a
A → aA | b
B → bB | ε
After converting to GNF:
S → aA | a
A → aA | b
B → bB | b
Comparison of CNF and GNF
Both CNF and GNF standardize CFGs but serve different purposes:
Aspect | Chomsky Normal Form (CNF) | Greibach Normal Form (GNF) |
---|---|---|
Production Rules | Binary productions (A → BC ) and terminal productions (A → a ). | Every production starts with a terminal (A → aα ). |
Parsing | Useful for CYK algorithm (bottom-up parsing). | Useful for top-down parsers. |
Complexity | Simpler for computational purposes. | More restrictive but aids in leftmost derivations. |
Key Takeaways
- Normal forms like CNF and GNF standardize CFGs for theoretical and practical purposes.
- CNF is best suited for bottom-up parsing algorithms like CYK.
- GNF ensures productions begin with a terminal, aiding in top-down parsing.
- Converting CFGs to normal forms involves removing ε-productions, unit productions, and left recursion.
- Understanding normal forms is essential for automata theory, parsing, and compiler design.