add README

2025-10-16 09:23:37 +02:00 · 2025-10-16 09:23:37 +02:00 · 621c9291ee
commit 621c9291ee
parent 7425a51871
1 changed files with 21 additions and 0 deletions
--- a/README.md
+++ b/README.md
@ -0,0 +1,21 @@
+# mnist
+This is a neural network trained to predict images of the [MNIST database](https://en.wikipedia.org/wiki/MNIST_database).
+It's written in python without making use of ML libraries directly, using [Apple's MLX](https://github.com/ml-explore/mlx) to accelerate matrix computations with the GPU.
+This makes the scripts very fast on Apple Silicon machines since MLX makes great use of the SoC shared memory and lazy evaluation for efficiency.
+The code is commented in detail and quite simple to understand with basic understanding of neural networks.
+
+## Implementation
+The network is structured to use 4 hidden layers of size 256x256. For activation of hidden layers [ReLU](https://en.wikipedia.org/wiki/Rectified_linear_unit) is used
+and loss is calculated with [cross-entropy](https://en.wikipedia.org/wiki/Cross-entropy). For optimization Adam is used (check the references section).
+
+## Running
+To run the scripts you first need to download the dataset and the dependencies with
+```sh
+./data.sh
+pip install -r requirements.txt
+```
+After that you can train the model by running the `train.py` script. To test the model you can run `test.py` instead. Accuracy should be around 97%.
+
+## References
+- [Adam optimization algorithm](https://arxiv.org/abs/1412.6980)
+- [Google ML crash course](https://developers.google.com/machine-learning/crash-course)