# CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

## Project Overview

PyClassifiers is a C++ library that provides wrappers for Python machine learning classifiers. It enables C++ applications to use Python-based ML algorithms (scikit-learn, XGBoost, custom implementations) through a unified interface.

## Essential Commands

### Build System
```bash
# Setup build configurations
make debug                      # Configure debug build with testing and coverage
make release                    # Configure release build

# Build targets
make buildd                     # Build debug version
make buildr                     # Build release version

# Testing
make test                       # Run all unit tests
make test opt="-s"              # Run tests with verbose output
make test opt="-c='Test Name'"  # Run specific test section

# Coverage
make coverage                   # Run tests and generate coverage report

# Installation
sudo make install               # Install library to system (requires release build)

# Utilities
make clean                      # Clean test artifacts
make help                       # Show all available targets
```

### Dependencies
- Requires the Conan package manager (`pip install conan`)
- Miniconda installation required for Python classifiers
- Boost library (preferably the system package: `sudo dnf install boost-devel`)

## Architecture

### Core Components

**PyWrap** (`pyclfs/PyWrap.h`): Singleton managing Python interpreter lifecycle and thread-safe Python/C++ communication.

**PyClassifier** (`pyclfs/PyClassifier.h`): Abstract base class inheriting from `bayesnet::BaseClassifier`. All Python classifier wrappers extend this class.

**Individual Classifiers**: Each classifier (STree, ODTE, SVC, RandomForest, XGBoost, AdaBoostPy) wraps specific Python modules with a consistent C++ interface.
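
To make the singleton role of PyWrap concrete, here is a minimal sketch of a thread-safe singleton in standard C++17. `InterpreterRegistry`, `registerClassifier()`, and `count()` are hypothetical names invented for this illustration; only the `getInstance()` accessor name comes from this document, and the real interface in `pyclfs/PyWrap.h` may differ.

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>
#include <string>
#include <vector>

// Hypothetical stand-in for an interpreter-owning singleton; NOT the real PyWrap API.
class InterpreterRegistry {
public:
    // Meyers singleton: a function-local static is initialized exactly once,
    // and that initialization is thread-safe since C++11.
    static InterpreterRegistry& getInstance() {
        static InterpreterRegistry instance;
        return instance;
    }

    InterpreterRegistry(const InterpreterRegistry&) = delete;
    InterpreterRegistry& operator=(const InterpreterRegistry&) = delete;

    // Records a classifier id under a lock and returns the new number of ids.
    std::size_t registerClassifier(const std::string& id) {
        assert(!id.empty());
        std::lock_guard<std::mutex> lock(mutex_);
        ids_.push_back(id);
        return ids_.size();
    }

    // Returns how many ids have been registered so far.
    std::size_t count() const {
        std::lock_guard<std::mutex> lock(mutex_);
        return ids_.size();
    }

private:
    InterpreterRegistry() = default;

    mutable std::mutex mutex_;      // guards ids_ against concurrent access
    std::vector<std::string> ids_;  // shared state owned by the single instance
};
```

Every call to `InterpreterRegistry::getInstance()` returns the same object, so state registered through one reference is visible through any other.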

### Data Flow
- Uses PyTorch tensors for efficient C++/Python data exchange
- JSON-based hyperparameter configuration
- Automatic memory management for Python objects

## Key Directories

- `pyclfs/` - Core library source code
- `tests/` - Catch2 unit tests with ARFF test datasets
- `build_debug/` - Debug build artifacts
- `build_release/` - Release build artifacts
- `cmake/modules/` - Custom CMake modules

## Development Patterns

### Adding New Classifiers
1. Inherit from the `PyClassifier` base class
2. Implement the required virtual methods: `fit()`, `predict()`, `predict_proba()`
3. Use `PyWrap::getInstance()` for Python interpreter access
4. Handle hyperparameters via JSON configuration
5. Add corresponding unit tests in `tests/TestPythonClassifiers.cc`
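
The steps above can be sketched as follows. This is a self-contained illustration, not the project's API: `ClassifierBase` is a hypothetical stand-in for `PyClassifier`, it uses `std::vector` where the real wrappers exchange PyTorch tensors, and `MajorityClassifier` is a toy model that never calls Python. It only shows the shape of overriding `fit()`, `predict()`, and `predict_proba()`.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <map>
#include <stdexcept>
#include <vector>

using Matrix = std::vector<std::vector<double>>;

// Hypothetical stand-in for the abstract base; the real PyClassifier signatures may differ.
class ClassifierBase {
public:
    virtual ~ClassifierBase() = default;
    virtual void fit(const Matrix& X, const std::vector<int>& y) = 0;
    virtual std::vector<int> predict(const Matrix& X) const = 0;
    virtual Matrix predict_proba(const Matrix& X) const = 0;
};

// Toy classifier (assumption for illustration): always predicts the most
// frequent training label and reports the training label frequencies.
class MajorityClassifier : public ClassifierBase {
public:
    void fit(const Matrix& X, const std::vector<int>& y) override {
        if (X.size() != y.size() || y.empty())
            throw std::invalid_argument("X and y must be non-empty and the same length");
        counts_.clear();
        for (int label : y) ++counts_[label];
        total_ = y.size();
        assert(!counts_.empty());
        majority_ = std::max_element(
            counts_.begin(), counts_.end(),
            [](const auto& a, const auto& b) { return a.second < b.second; })->first;
    }

    std::vector<int> predict(const Matrix& X) const override {
        requireFitted();
        return std::vector<int>(X.size(), majority_);
    }

    // One row per sample; columns follow the ascending order of the labels seen in fit().
    Matrix predict_proba(const Matrix& X) const override {
        requireFitted();
        std::vector<double> row;
        for (const auto& entry : counts_)
            row.push_back(static_cast<double>(entry.second) / static_cast<double>(total_));
        return Matrix(X.size(), row);
    }

private:
    void requireFitted() const {
        if (total_ == 0) throw std::logic_error("call fit() before predicting");
    }

    std::map<int, std::size_t> counts_;  // label -> number of training samples
    std::size_t total_ = 0;              // number of training samples
    int majority_ = 0;                   // most frequent training label
};
```

A real wrapper would forward these calls to the Python module through `PyWrap` rather than computing anything in C++; the sketch keeps the logic local so it compiles without Python, Boost, or PyTorch.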

### Python Integration
- All Python interactions go through the PyWrap singleton
- Use the RAII pattern for Python object management
- Convert data using PyTorch tensors (discrete/continuous data support)
- Catch Python exceptions and convert them to C++ exceptions
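
As a rough illustration of the RAII and exception-conversion points above, the sketch below uses only the standard library. Everything in it is an assumption made for the example: `FakePyObject` stands in for a reference-counted Python object, `ScopedRef` for an owning handle, `FakePythonError` for a failure raised on the Python side, and `callTranslated()` for a translation helper. None of these names come from the PyClassifiers sources.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>
#include <utility>

// Hypothetical stand-in for a Python object with a manual reference count.
struct FakePyObject {
    int refcount = 1;
};

// RAII owner: drops its reference in the destructor, so the release also
// happens when an exception unwinds the stack.
class ScopedRef {
public:
    explicit ScopedRef(FakePyObject* obj) : obj_(obj) {}
    ~ScopedRef() {
        if (obj_) {
            assert(obj_->refcount > 0);
            --obj_->refcount;
        }
    }

    ScopedRef(const ScopedRef&) = delete;
    ScopedRef& operator=(const ScopedRef&) = delete;
    ScopedRef(ScopedRef&& other) noexcept : obj_(std::exchange(other.obj_, nullptr)) {}

    FakePyObject* get() const { return obj_; }

private:
    FakePyObject* obj_;
};

// Hypothetical error type standing in for a failure raised by Python code.
struct FakePythonError {
    std::string message;
};

// Runs a callable and rethrows any FakePythonError as std::runtime_error,
// so callers only ever see standard C++ exceptions.
template <typename Fn>
auto callTranslated(Fn&& fn) -> decltype(fn()) {
    try {
        return fn();
    } catch (const FakePythonError& e) {
        throw std::runtime_error("Python error: " + e.message);
    }
}
```

If the callable throws while a `ScopedRef` is alive, the handle releases its reference during unwinding and the caller receives a `std::runtime_error` carrying the original message.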

### Testing
- Catch2 framework with parameterized tests using GENERATE()
- Test data in ARFF format located in `tests/data/`
- Performance benchmarks validate expected accuracy scores
- Coverage reports generated with gcovr

## Important Files

- `pyclfs/PyWrap.h` - Python interpreter management
- `pyclfs/PyClassifier.h` - Base classifier interface
- `CMakeLists.txt` - Main build configuration
- `Makefile` - Build automation and common tasks
- `conanfile.py` - Package dependencies
- `tests/TestPythonClassifiers.cc` - Main test suite

## Technical Requirements

- C++17 standard compliance
- Python 3.11+ required
- Boost library with Python and NumPy support
- PyTorch for tensor operations
- Thread-safe design for concurrent usage