MRKL Systems: A modular, neuro-symbolic architecture that combines large language models, external knowledge sources and discrete reasoning - Summary

The paper discusses the limitations of large language models (LMs) and proposes a neuro-symbolic architecture, the Modular Reasoning, Knowledge and Language (MRKL) system, to overcome them. A MRKL system combines LMs with external knowledge sources and discrete reasoning modules.

Arxiv URL: https://arxiv.org/abs/2205.00445

Authors: Ehud Karpas, Omri Abend, Yonatan Belinkov, Barak Lenz, Opher Lieber, Nir Ratner, Yoav Shoham, Hofit Bata, Yoav Levine, Kevin Leyton-Brown, Dor Muhlgay, Noam Rozen, Erez Schwartz, Gal Shachaf, Shai Shalev-Shwartz, Amnon Shashua, Moshe Tenenholtz

Summary:

MRKL (pronounced "miracle") is a neuro-symbolic architecture that combines:

  • Neural modules (including large language models and specialized smaller models)
  • Symbolic modules (calculators, databases, APIs, etc.)
  • A router that directs inputs to appropriate modules
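The three components above can be sketched in a few lines of Python. This is only an illustrative sketch, not the paper's or Jurassic-X's implementation: the names Module, route, calculator and fallback_lm are hypothetical, the router is a simple regular-expression check standing in for whatever routing model a real system would use, and the language model is a stub that echoes its input.

```python
import ast
import operator
import re
from typing import Callable, List, NamedTuple, Tuple


class Module(NamedTuple):
    """Hypothetical module interface: a name, a routing test, and a handler."""
    name: str
    can_handle: Callable[[str], bool]
    run: Callable[[str], str]


# Arithmetic operators the symbolic calculator module supports.
_OPS = {ast.Add: operator.add, ast.Sub: operator.sub,
        ast.Mult: operator.mul, ast.Div: operator.truediv}


def _eval(node: ast.AST) -> float:
    """Evaluate a parsed arithmetic expression without using eval()."""
    if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
        return node.value
    if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
        return _OPS[type(node.op)](_eval(node.left), _eval(node.right))
    raise ValueError("unsupported expression")


def calculator(text: str) -> str:
    """Symbolic module: exact arithmetic on a plain expression."""
    return str(_eval(ast.parse(text.strip(), mode="eval").body))


def fallback_lm(text: str) -> str:
    """Stub standing in for a general-purpose LM (assumption: no real model)."""
    return f"[LM stub] {text}"


# Assumed routing rule: input made only of numbers and + - * / goes to the calculator.
_ARITH = re.compile(r"^\s*\d+(\.\d+)?(\s*[-+*/]\s*\d+(\.\d+)?)+\s*$")

MODULES: List[Module] = [
    Module("calculator", lambda t: bool(_ARITH.match(t)), calculator),
]
FALLBACK = Module("general_lm", lambda t: True, fallback_lm)


def route(text: str) -> Tuple[str, str]:
    """Send the input to the first module that accepts it, else to the fallback LM."""
    for module in MODULES:
        if module.can_handle(text):
            return module.name, module.run(text)
    return FALLBACK.name, FALLBACK.run(text)


print(route("12 * 7"))
print(route("Who wrote Hamlet?"))
```

In this sketch the arithmetic query is answered by the symbolic module, while the open-ended question falls through to the stubbed language model.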

Key Insights & Learnings:

  • Large language models are inherently limited: they lack access to current and proprietary information sources, and they are unreliable at symbolic reasoning such as arithmetic.
  • Traditional approaches to deploying LMs, such as fine-tuning a separate model for each task, result in model explosion and a loss of versatility.
  • The MRKL system consists of an extendable set of modules, including neural and symbolic modules, and a router that routes inputs to the best module.
  • The MRKL system offers safe fallback, robust extensibility, interpretability, up-to-date information, access to proprietary knowledge, and compositionality.
  • The MRKL system has been implemented as Jurassic-X by AI21 Labs and is being piloted.

Key Benefits of MRKL Systems:

  • Safe fallback to the general-purpose LM when needed
  • Robust extensibility (new capabilities can be added without retraining everything)
  • Better interpretability through module-specific explanations
  • Access to up-to-date and proprietary information
  • Ability to compose multiple modules for complex tasks
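Three of these benefits (extensibility, safe fallback and composition) can be illustrated with a small Python sketch. Everything here is an assumption made for illustration: the registry, the module names price_lookup and multiply, and the PRICES table are hypothetical and carry made-up data; none of it comes from the paper or from Jurassic-X.

```python
from typing import Callable, Dict

Registry = Dict[str, Callable[[str], str]]


def make_registry() -> Registry:
    """Start with only a stubbed general-purpose LM (assumption: no real model)."""
    return {"general_lm": lambda query: f"[LM stub] {query}"}


def register(registry: Registry, name: str, fn: Callable[[str], str]) -> None:
    """Extensibility: add a capability without touching existing modules."""
    registry[name] = fn


def call(registry: Registry, name: str, arg: str) -> str:
    """Safe fallback: an unknown module name is served by the general LM."""
    return registry.get(name, registry["general_lm"])(arg)


# Made-up data standing in for a proprietary knowledge source.
PRICES = {"widget": 2.5}


def price_lookup(item: str) -> str:
    """Hypothetical knowledge module: look up a unit price."""
    return str(PRICES[item])


def multiply(expr: str) -> str:
    """Hypothetical symbolic module: multiply two numbers written as 'a*b'."""
    left, right = expr.split("*")
    return str(float(left) * float(right))


def total_cost(registry: Registry, item: str, quantity: int) -> str:
    """Composition: the lookup module's output feeds the arithmetic module."""
    unit_price = call(registry, "price_lookup", item)
    return call(registry, "multiply", f"{unit_price}*{quantity}")


registry = make_registry()
register(registry, "price_lookup", price_lookup)
register(registry, "multiply", multiply)

print(total_cost(registry, "widget", 4))
print(call(registry, "translator", "bonjour"))
```

The two register calls add new capabilities without changing the existing LM entry, and the request for the unregistered "translator" module is answered by the fallback rather than failing.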


Terms Mentioned: neuro-symbolic architecture, large language models, external knowledge sources, discrete reasoning modules, Modular Reasoning, Knowledge and Language (MRKL) system, neural modules, symbolic modules, model explosion, versatility, safe fallback, robust extensibility, interpretability, up-to-date information, proprietary knowledge, compositionality, AI21 Labs, Jurassic-X

Technologies / Libraries Mentioned: BERT, GPT-3, Jurassic-1, PaLM