A Simple Propositional Logic Theorem Prover

This challenge asks you to implement a basic theorem prover for propositional logic. Theorem provers are fundamental tools in computer science and mathematics, used to formally verify the correctness of programs and proofs. Building a simple prover will give you a practical understanding of logical reasoning and automated deduction.

Problem Description

You are to create a Python program that can determine the validity of a propositional logic formula given a set of axioms. The formula and axioms will be represented using a string-based notation. The prover should employ a resolution-based inference engine.

What needs to be achieved:

Parsing: Parse the formula and axioms into a suitable internal representation (e.g., a list of clauses). Clauses are disjunctions of literals. Literals are either variables (e.g., 'p', 'q', 'r') or their negations (e.g., '~p', '~q', '~r').
Resolution: Implement a resolution inference rule. The resolution rule takes two clauses and infers a new clause by removing a complementary literal pair (one positive and one negative literal with the same variable) from the two clauses.
Empty Clause Detection: The prover should terminate when it derives the empty clause (a clause with no literals), indicating that the original formula is refutable (and therefore, its negation is valid).
Termination: The prover should terminate if no new clauses can be generated after a certain number of resolution steps (to prevent infinite loops).

Key Requirements:

The input will consist of a formula and a set of axioms, all represented as strings.
The formula will be the negation of the statement you want to prove.
The axioms will be the known truths.
The output should be a boolean value: True if the formula is refutable (i.e., the original statement is valid), and False otherwise.
The program must handle variable names consisting of lowercase letters (a-z).
Negation is represented by '~' preceding the variable.
Disjunction (OR) is implicitly represented by the clause structure. Clauses are separated by newline characters.

Expected Behavior:

The program should take the formula and axioms as input, perform resolution, and return True if the empty clause is derived, and False otherwise.

Edge Cases to Consider:

Empty input (no axioms).
Formula already containing the empty clause.
Axioms already containing the empty clause.
Variables appearing in the formula and axioms but not in any clauses.
Duplicate clauses.
Clauses with the same literal appearing multiple times.

Examples

Example 1:

Input:
Formula:
~p
Axioms:
p
~p | q
Output:
True
Explanation:
The formula is ~p. The axioms are p, and ~p | q.  Resolution between ~p and p yields the empty clause. Therefore, the formula is refutable.

Example 2:

Input:
Formula:
~p | q
Axioms:
p
Output:
False
Explanation:
The formula is ~p | q. The axiom is p. Resolution between p and ~p | q yields q. No empty clause is derived, so the formula is not refutable.

Example 3: (Edge Case)

Input:
Formula:
~p
Axioms:
Output:
False
Explanation:
The formula is ~p. There are no axioms. No resolution steps can be performed, and the empty clause is not derived.

Constraints

Input Size: The total number of literals across the formula and axioms should not exceed 100.
Input Format: The formula and axioms must be provided as strings, with clauses separated by newline characters.
Resolution Steps: Limit the number of resolution steps to 1000 to prevent infinite loops.
Variable Names: Variable names must be single lowercase letters (a-z).
Memory Usage: Reasonable memory usage is expected. Avoid excessive data structures.

Notes

Consider using sets to efficiently store and manage clauses, eliminating duplicates.
The resolution rule can be implemented as a function that takes two clauses as input and returns the resulting clause (or None if no resolution is possible).
Think about how to represent literals internally (e.g., as tuples of (variable, is_negated)).
The order in which you apply the resolution rule can affect performance. Consider heuristics for choosing which clauses to resolve.
Focus on correctness first, then optimize for performance if necessary. A working, albeit slow, prover is better than a fast, incorrect one.
The goal is to demonstrate understanding of resolution and automated theorem proving, not to create a highly optimized or feature-rich prover.