Wide and Deep Learning: Custom Multi-Layer Perceptron from Scratch
Project Overview
This project explores the construction and training of multi-layer perceptrons (MLPs) from scratch in Python, highlighting a hands-on approach to deep learning model architecture, interpretability, and extension. Unlike typical projects that depend entirely on frameworks like TensorFlow or PyTorch, this work implements the essential deep learning logic with class-based architecture and low-level Python constructs, ensuring transparent model behavior and full control over each component.
Methodology
Key Steps and Innovations
- Manual Neural Network Construction: Developed all layers, activation functions (ReLU, sigmoid, etc.), and loss calculations from first principles using Python classes, without abstracting away complexity via Keras/PyTorch modules (see the layer and training-loop sketch after this list).
- Stepwise Implementation: Built the network one step at a time, moving from forward propagation, loss calculation, and gradient computation to backpropagation and weight updates.
- Custom Training Loop: Designed an end-to-end training routine supporting batch processing, mini-batching, epoch control, and performance logging.
- Interpretability Focus: Exposed intermediate gradients, activations, and weights at each epoch, supporting introspection and debugging.
- Wide and Deep Integration: Explored "wide" feature concatenation with "deep" neural architectures to benchmark classic logistic regression against deeper learned representations (see the wide-and-deep sketch after this list).
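To make the approach concrete, below is a minimal sketch of how such a from-scratch implementation can be organized: a dense layer and ReLU activation with explicit forward/backward methods that cache inputs and gradients for inspection, plus a mini-batch training loop with per-epoch loss logging. All class and function names are illustrative placeholders, not the repository's actual API.

```python
# Minimal from-scratch MLP sketch (illustrative names, NumPy only).
import numpy as np

class Dense:
    """Fully connected layer; caches inputs and gradients so they can be inspected."""
    def __init__(self, n_in, n_out, lr=0.01, seed=0):
        rng = np.random.default_rng(seed)
        self.W = rng.normal(0, np.sqrt(2.0 / n_in), size=(n_in, n_out))
        self.b = np.zeros(n_out)
        self.lr = lr

    def forward(self, x):
        self.x = x                      # cache input for the backward pass
        return x @ self.W + self.b

    def backward(self, grad_out):
        # Gradients are stored on the layer so they can be logged each epoch.
        self.dW = self.x.T @ grad_out
        self.db = grad_out.sum(axis=0)
        grad_in = grad_out @ self.W.T
        self.W -= self.lr * self.dW     # plain SGD update
        self.b -= self.lr * self.db
        return grad_in

class ReLU:
    def forward(self, x):
        self.mask = x > 0
        return x * self.mask

    def backward(self, grad_out):
        return grad_out * self.mask

def mse_loss(pred, target):
    """Returns the loss value and its gradient w.r.t. the prediction."""
    diff = pred - target
    return np.mean(diff ** 2), 2 * diff / diff.size

def train(layers, X, y, epochs=50, batch_size=32, seed=0):
    """Mini-batch training loop with epoch control and loss logging."""
    rng = np.random.default_rng(seed)
    for epoch in range(epochs):
        idx = rng.permutation(len(X))
        epoch_loss = 0.0
        for start in range(0, len(X), batch_size):
            batch = idx[start:start + batch_size]
            out = X[batch]
            for layer in layers:                 # forward pass
                out = layer.forward(out)
            loss, grad = mse_loss(out, y[batch])
            epoch_loss += loss * len(batch)
            for layer in reversed(layers):       # backpropagation + update
                grad = layer.backward(grad)
        print(f"epoch {epoch:3d}  loss {epoch_loss / len(X):.4f}")

# Example: a two-layer regressor on synthetic tabular data.
X = np.random.rand(256, 8)
y = X.sum(axis=1, keepdims=True)
model = [Dense(8, 16), ReLU(), Dense(16, 1)]
train(model, X, y)
```

Because each layer keeps `x`, `dW`, and `db` as plain attributes, intermediate activations and gradients can be printed or plotted after any epoch, which is the interpretability hook described above.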
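The wide-and-deep idea itself reduces to concatenating the raw ("wide") features with the deep path's learned representation before the output layer. A minimal NumPy forward-pass sketch follows, with random placeholder weights and an assumed binary classification head; shapes and sizes are illustrative only.

```python
# Wide-and-deep forward pass sketch (NumPy only; weights are random placeholders).
import numpy as np

rng = np.random.default_rng(0)
X = rng.random((4, 10))                 # raw tabular features

# Deep path: two hidden layers learn a dense representation.
W1, b1 = rng.normal(size=(10, 32)), np.zeros(32)
W2, b2 = rng.normal(size=(32, 16)), np.zeros(16)
h = np.maximum(0, X @ W1 + b1)          # ReLU hidden layer
deep_out = np.maximum(0, h @ W2 + b2)

# Wide path: the raw features pass through unchanged,
# mirroring a linear / logistic-regression style "memorization" component.
wide_out = X

# Concatenate both paths and apply a single output unit (sigmoid for binary tasks).
combined = np.concatenate([wide_out, deep_out], axis=1)   # shape (4, 10 + 16)
W_out, b_out = rng.normal(size=(combined.shape[1], 1)), np.zeros(1)
logits = combined @ W_out + b_out
probs = 1.0 / (1.0 + np.exp(-logits))
print(probs.shape)                      # (4, 1)
```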
Tools & Technologies
- Languages: Python
- Libraries: NumPy (core math operations); TensorFlow/Keras (for benchmarking against the custom implementation, not for core logic)
- Architecture: Custom OOP Python classes for all neural net elements
Key Achievements
- Full Transparency: Every step of the training and prediction pipeline is explicit, allowing debugging, learning, and experimentation not possible with high-level APIs alone.
- Educational Value: Provides a solid foundation for students and practitioners who want to understand the mechanics of backpropagation, gradient descent, and the impact of architecture changes.
- Performance Validation: Benchmarked the custom model against Keras/TensorFlow MLPs and achieved comparable accuracy on structured-data tasks, validating the correctness of the custom implementation (a sketch of a Keras baseline follows this list).
- Extensibility: The codebase is modular; future architectures (e.g., convolutional layers, custom activations) can be integrated without rewriting the core logic.
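For the benchmarking step, a small Keras MLP can serve as the reference model. The layer sizes, optimizer, and training call below are illustrative assumptions rather than the project's exact configuration.

```python
# Reference MLP in Keras, used only as a benchmarking baseline
# (architecture and dataset split are illustrative, not the project's exact setup).
import tensorflow as tf

def keras_baseline(n_features):
    model = tf.keras.Sequential([
        tf.keras.Input(shape=(n_features,)),
        tf.keras.layers.Dense(16, activation="relu"),
        tf.keras.layers.Dense(1),
    ])
    model.compile(optimizer="adam", loss="mse", metrics=["mae"])
    return model

# model = keras_baseline(8)
# model.fit(X_train, y_train, epochs=50, batch_size=32,
#           validation_data=(X_val, y_val))
# Comparing its loss curve with the custom loop's logged losses is one way
# to sanity-check the hand-written gradients.
```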
Example Results
- Custom MLP achieved competitive accuracy and loss performance on tabular datasets, matching high-level frameworks within a reasonable margin.
- Demonstrated clear learning curves, reproducible results, and interpretable gradient flows.
- Wide-and-deep hybridization led to measurable improvements over single-path (wide or deep alone) models for specific tasks.
GitHub Repository
Conclusion & Future Work
Constructing neural networks from the ground up builds an understanding of model mechanics, gradient flow, and optimization that is rarely gained from high-level libraries alone. This project's hands-on approach creates a strong base for exploring more complex architectures, experimenting with new modeling ideas, and building transparent, reproducible analytics for both practical and educational settings.