Chomsky Hierarchy

The Chomsky Hierarchy, introduced by Noam Chomsky in 1956, is a classification of formal languages based on their generative power. It defines four types of languages, each associated with a specific type of grammar and computational model. This hierarchy is foundational in the theory of computation and linguistics, as it describes the relationships between different classes of languages and the machines that recognize them.


Classification of Languages and Grammars

The Chomsky Hierarchy consists of four levels, each corresponding to a class of formal languages:

  1. Type 0: Recursively Enumerable Languages
  2. Type 1: Context-Sensitive Languages
  3. Type 2: Context-Free Languages
  4. Type 3: Regular Languages

Each type of language is generated by a corresponding type of grammar and recognized by a specific computational model.


Type 0: Recursively Enumerable Languages

Recursively Enumerable Languages are the most general class in the hierarchy. They are generated by unrestricted grammars and recognized by Turing Machines.

Grammar Rules

Productions in Type 0 grammars are of the form:

</>
Copy
α → β

Where α and β are strings of terminals and non-terminals, and α must contain at least one non-terminal.

Examples:

  • The Halting Problem language.
  • Languages recognized by any arbitrary Turing Machine.

Type 1: Context-Sensitive Languages

Context-Sensitive Languages are generated by context-sensitive grammars (CSGs) and recognized by Linear Bounded Automata (LBAs).

Grammar Rules

Productions in Type 1 grammars are of the form:

</>
Copy
αAβ → αγβ

Where A is a non-terminal, α and β are strings of terminals and non-terminals, and γ is a non-empty string.

Examples:

  • L = {aⁿbⁿcⁿ | n ≥ 1} (equal numbers of a‘s, b‘s, and c‘s).
  • Languages that require additional context to parse.

Type 2: Context-Free Languages

Context-Free Languages (CFLs) are generated by context-free grammars (CFGs) and recognized by Pushdown Automata (PDAs).

Grammar Rules

Productions in Type 2 grammars are of the form:

</>
Copy
A → γ

Where A is a single non-terminal, and γ is a string of terminals and non-terminals.

Examples:

  • L = {aⁿbⁿ | n ≥ 1} (equal numbers of a‘s and b‘s).
  • Languages that can be described by balanced parentheses.

Type 3: Regular Languages

Regular Languages are generated by regular grammars and recognized by Finite Automata (FA), including deterministic (DFA) and nondeterministic (NFA) models.

Grammar Rules

Productions in Type 3 grammars are of the form:

</>
Copy
A → aB | a

Where A and B are non-terminals, and a is a terminal.

Examples:

  • L = {aⁿ | n ≥ 0} (strings of a‘s).
  • Languages like binary strings divisible by a fixed number.

Key Takeaways

  • Chomsky Hierarchy: Classifies languages into four types: recursively enumerable, context-sensitive, context-free, and regular.
  • Associated Models: Each type is associated with specific grammars and computational models (e.g., Turing Machines for Type 0, Finite Automata for Type 3).
  • Expressive Power: Higher levels in the hierarchy can generate more complex languages, but they also require more computational resources.
  • Applications: Understanding the hierarchy is crucial for parsing, language processing, and designing algorithms in theoretical and applied computer science.