AnirudhG07/EasyMathDataset

Easy Math Dataset

This repository contains a set of easy problems from different fields, together with their proofs. They were generated by the GPT-4o model. The problems are pure mathematical statements rather than grade-school math word problems or hard Olympiad-level proofs.

Topics

Around 15 problems have been generated for each of the following topics. You can get a more detailed summary by running the emad summary command.

  • Linear Algebra
  • Algebra
  • Geometry
  • Calculus
  • Number Theory
  • Statistics
  • Trigonometry
  • Probability
  • Combinatorics
  • Logic
  • Set Theory
  • Graph Theory
  • Topology
  • Real Analysis
  • Differential Equations
  • Abstract Algebra
  • Group Theory
  • Complex Analysis
  • Vector Calculus

Note that the proofs of the problems in this dataset have not been verified yet. See #1 for more details.

Significance

This dataset can help in checking how well LLMs convert a text proof to Lean4. These obvious statements can also be used to examine how LLMs perform at generating pure mathematical proofs, since they sometimes make silly errors on simple problems.
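To give a feel for the kind of text-to-Lean4 conversion meant here, below is a minimal sketch. Both statements are hypothetical illustrations and are not taken from the dataset; they assume plain Lean 4 without Mathlib.

```lean
-- Hypothetical examples, NOT entries from the dataset.

-- Text statement: "For every natural number n, n + 0 = n."
theorem add_zero_example (n : Nat) : n + 0 = n :=
  Nat.add_zero n

-- Text statement: "If p and q both hold, then q and p both hold."
theorem and_swap_example (p q : Prop) (h : p ∧ q) : q ∧ p :=
  ⟨h.2, h.1⟩
```

Statements this simple make it easy to see whether a model formalised the claim faithfully or introduced an error along the way.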

Usage

Extracting the problem

I have made a simple CLI (called emad) that extracts a problem from the dataset and prints it in the terminal, so you don't need to hunt for it inside the JSON file.

You can run the commands below in the terminal (I assume you have Python installed):

git clone https://github.com/AnirudhG07/EasyMathDataset.git
cd EasyMathDataset
pip install .

This will install the CLI on your system. Now you can run the commands below to get a problem.

# For extracting the problem and proof.
emad ex --[t]opic <topic> --[i]d <id>
# Example
emad ex -t "Linear Algebra" -i 1

# For summarising the dataset content.
emad summary

In the ex command, please use the topic name exactly as it appears in the summary output.
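If you want to pull problems from a script rather than by hand, one option is to wrap the CLI with Python's subprocess module. The sketch below is only an illustration: the helper names build_emad_command and extract_problem are hypothetical, and it assumes that emad is on your PATH, prints the problem to standard output, and exits with a non-zero status on failure.

```python
import shutil
import subprocess


def build_emad_command(topic: str, problem_id: int) -> list[str]:
    """Build the argument list for `emad ex -t <topic> -i <id>`."""
    # Passing a list (not a shell string) keeps topics with spaces,
    # such as "Linear Algebra", intact as a single argument.
    return ["emad", "ex", "-t", topic, "-i", str(problem_id)]


def extract_problem(topic: str, problem_id: int) -> str:
    """Run emad and return whatever it prints (assumed to go to stdout)."""
    if shutil.which("emad") is None:
        raise RuntimeError("emad not found on PATH; run `pip install .` first")
    result = subprocess.run(
        build_emad_command(topic, problem_id),
        capture_output=True,
        text=True,
        check=True,  # raises CalledProcessError on a non-zero exit status
    )
    return result.stdout
```

For example, print(extract_problem("Linear Algebra", 1)) would show the same text as the command-line example above, provided the assumptions hold.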

Contributing

It would be nice if you could help me verify the proofs of the problems. Please check out #1 to see how you can help.

I am also hoping to add correct Lean4 proofs for these problems to the dataset. As of this commit, that has not been done yet.
