# AI-Feynman
This code is an improved implementation of AI Feynman: a Physics-Inspired Method for Symbolic Regression, Silviu-Marian Udrescu and Max Tegmark (2019) [[Science Advances](https://advances.sciencemag.org/content/6/16/eaay2631/tab-pdf)] and AI Feynman 2.0: Pareto-optimal symbolic regression exploiting graph modularity, Udrescu S.M. et al. (2020) [[arXiv](https://arxiv.org/abs/2006.10782)].

Please check [[this Medium article](https://towardsdatascience.com/ai-feynman-2-0-learning-regression-equations-from-data-3232151bd929)] for a more detailed explanation of how to get the code running.

To get started, run compile.sh to compile the Fortran files used by the brute force code.

ai_feynman_example.py shows how to run the code on some examples (found in the example_data directory). The examples correspond to equations I.8.14, I.10.7 and I.50.26 in Table 4 of the paper. More data files on which the code can be tested can be found in the [Feynman Symbolic Regression Database](https://space.mit.edu/home/tegmark/aifeynman.html).

The main function of the code, called by the user, has the following parameters:

* pathdir - path to the directory containing the data file
* filename - the name of the file containing the data
* BF_try_time - time limit for each brute force call (set by default to 60 seconds)
* BF_ops_file_type - file containing the symbols to be used in the brute force code (set by default to "14ops.txt")
* polyfit_deg - maximum degree of the polynomial tried by the polynomial fit routine (set by default to 4)
* NN_epochs - number of epochs for the training (set by default to 4000)
* vars_name - names of the variables appearing in the equation (including the name of the output variable). This should be passed as a list of strings, with the names of the variables appearing in the same order as they are in the file containing the data
* test_percentage - percentage of the input data to be kept aside and used as the test set
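
As a minimal sketch of how the parameters above fit together, the snippet below collects them into keyword arguments and hands them to an entry point. The helper build_arguments, the stub fake_entry_point, the file name my_data.txt and the variable names are all hypothetical. In practice you would pass the arguments to whichever function ai_feynman_example.py calls, and it is an assumption that this function accepts the parameters as keyword arguments.

```python
from typing import Any, Dict, Sequence

# Defaults exactly as listed in the parameter descriptions above.
DOCUMENTED_DEFAULTS: Dict[str, Any] = {
    "BF_try_time": 60,                # seconds per brute force call
    "BF_ops_file_type": "14ops.txt",  # symbols available to the brute force code
    "polyfit_deg": 4,                 # maximum polynomial degree
    "NN_epochs": 4000,                # training epochs
}


def build_arguments(pathdir: str, filename: str, vars_name: Sequence[str],
                    test_percentage: float, **overrides: Any) -> Dict[str, Any]:
    """Collect the documented parameters into one dict (hypothetical helper)."""
    unknown = set(overrides) - set(DOCUMENTED_DEFAULTS)
    if unknown:
        raise TypeError(f"undocumented parameter(s): {sorted(unknown)}")
    return {
        "pathdir": pathdir,
        "filename": filename,
        **DOCUMENTED_DEFAULTS,
        **overrides,
        "vars_name": list(vars_name),
        "test_percentage": test_percentage,
    }


def fake_entry_point(**kwargs: Any) -> Dict[str, Any]:
    """Stand-in for the real main function so the sketch runs on its own."""
    return kwargs


arguments = build_arguments(
    pathdir="../example_data/",
    filename="my_data.txt",        # hypothetical file with three columns
    vars_name=["x1", "x2", "y"],   # hypothetical names, in column order
    test_percentage=20,            # keep 20% of the rows aside as the test set
    BF_try_time=30,                # override one documented default
)
result = fake_entry_point(**arguments)  # replace with the real main function
print(result["BF_try_time"], result["polyfit_deg"])
```

Keeping the defaults in one dictionary makes it easy to see which values were changed for a given run.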

The data file to be analyzed should be a text file in which each column contains the numerical values of one variable (dependent or independent). The solution file will be saved in the directory called "results" under the name solution_{filename}. The solution file will contain several rows (one for each point on the Pareto frontier), each row showing:

* the mean base-2 logarithm of the error of the discovered equation applied to the input data (this can be thought of as the average error in bits)
* the cumulative base-2 logarithm of the error of the discovered equation applied to the input data (this can be thought of as the cumulative error in bits)
* the complexity of the discovered equation (in bits)
* the error of the discovered equation applied to the input data
* the symbolic expression of the discovered equation

If test_percentage is nonzero, one more number is added at the beginning of each row, showing the error of the discovered equation on the test set.
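
The file layouts described above can be sketched as follows: a writer that produces a small input data file, and a parser for one row of the solution file. Both helpers are hypothetical and rest on assumptions not stated above: that columns are separated by whitespace, that the dependent variable sits in the last column, and that the symbolic expression is the final field of a solution row. The toy formula and the sample rows are made up for illustration.

```python
import os
import random
import tempfile
from dataclasses import dataclass
from typing import Optional


def write_data_file(path: str, n_rows: int = 100, seed: int = 0) -> None:
    """Write columns x1, x2, y with y = x1 * x2 (toy formula, assumed layout)."""
    rng = random.Random(seed)
    with open(path, "w") as handle:
        for _ in range(n_rows):
            x1, x2 = rng.uniform(1.0, 5.0), rng.uniform(1.0, 5.0)
            handle.write(f"{x1} {x2} {x1 * x2}\n")


@dataclass
class SolutionRow:
    test_error: Optional[float]   # present only when test_percentage is nonzero
    mean_log2_error: float        # average error in bits
    cumulative_log2_error: float  # cumulative error in bits
    complexity_bits: float
    error: float
    expression: str


def parse_solution_row(line: str, has_test_error: bool = False) -> SolutionRow:
    """Split one solution row into its numeric fields and the expression."""
    n_numeric = 5 if has_test_error else 4
    parts = line.split(None, n_numeric)  # the expression may contain spaces
    if len(parts) != n_numeric + 1:
        raise ValueError(f"expected {n_numeric} numbers and an expression: {line!r}")
    numbers = [float(part) for part in parts[:n_numeric]]
    test_error = numbers.pop(0) if has_test_error else None
    mean_log2, cumulative_log2, complexity, error = numbers
    return SolutionRow(test_error, mean_log2, cumulative_log2,
                       complexity, error, parts[-1].strip())


# Write a toy data file to a temporary directory and read the first row back.
data_path = os.path.join(tempfile.mkdtemp(), "my_data.txt")
write_data_file(data_path, n_rows=5)
with open(data_path) as handle:
    first_row = [float(value) for value in handle.readline().split()]

# Made-up rows, without and with the leading test-set error.
row = parse_solution_row("-20.5 -2050.0 12.0 1e-06 x1*x2\n")
row_with_test = parse_solution_row("2e-06 -20.5 -2050.0 12.0 1e-06 x1*x2",
                                   has_test_error=True)
print(row.expression, row.complexity_bits, row_with_test.test_error)
```

Passing has_test_error explicitly, rather than guessing from the number of fields, avoids misreading rows whose expression happens to contain spaces.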

ai_feynman_terminal_example.py allows calling the aiFeynman function from the command line (e.g. python ai_feynman_terminal_example.py --pathdir=../example_data/ --filename=example1.txt). Use python ai_feynman_terminal_example.py --help to display all the available parameters that can be passed to the function.
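
The command-line call above can also be launched from Python. This is a minimal sketch: build_command and run_terminal_example are hypothetical helpers, and only the --pathdir and --filename flags shown above are used. Any other flag should be looked up with --help rather than guessed.

```python
import shlex
import subprocess
import sys
from typing import List


def build_command(pathdir: str, filename: str) -> List[str]:
    """Assemble the argument list for the terminal example script."""
    return [
        sys.executable,  # the Python interpreter running this sketch
        "ai_feynman_terminal_example.py",
        f"--pathdir={pathdir}",
        f"--filename={filename}",
    ]


def run_terminal_example(pathdir: str, filename: str) -> None:
    """Run the script and raise CalledProcessError if it exits with an error."""
    subprocess.run(build_command(pathdir, filename), check=True)


command = build_command("../example_data/", "example1.txt")
print(shlex.join(command[1:]))  # the arguments that follow the interpreter
```

Calling run_terminal_example from the directory that contains the script would start the actual run.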
# Citation
If you compare with, build on, or use aspects of the AI Feynman work, please cite the following:
```
@article{udrescu2020ai,
  title={AI Feynman: A physics-inspired method for symbolic regression},
  author={Udrescu, Silviu-Marian and Tegmark, Max},
  journal={Science Advances},
  volume={6},
  number={16},
  pages={eaay2631},
  year={2020},
  publisher={American Association for the Advancement of Science}
}
```