CLAUDE.md
This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
Project Overview
PyClassifiers is a C++ library that provides wrappers for Python machine learning classifiers. It enables C++ applications to use Python-based ML algorithms (scikit-learn, XGBoost, custom implementations) through a unified interface.
Essential Commands
Build System
# Setup build configurations
make debug # Configure debug build with testing and coverage
make release # Configure release build
# Build targets
make buildd # Build debug version
make buildr # Build release version
# Testing
make test # Run all unit tests
make test opt="-s" # Run tests with verbose output
make test opt="-c='Test Name'" # Run specific test section
# Coverage
make coverage # Run tests and generate coverage report
# Installation
sudo make install # Install library to system (requires release build)
# Utilities
make clean # Clean test artifacts
make help # Show all available targets
Dependencies
- Requires Conan package manager (pip install conan)
- Miniconda installation required for Python classifiers
- Boost library (preferably system package: sudo dnf install boost-devel)
Architecture
Core Components
PyWrap (pyclfs/PyWrap.h): Singleton managing Python interpreter lifecycle and thread-safe Python/C++ communication.
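As a rough illustration of the singleton-plus-locking idea described for PyWrap, the sketch below uses only the standard library. The class name InterpreterGuard and its methods registerModel() and modelCount() are hypothetical and are not taken from pyclfs/PyWrap.h; the real class may be structured quite differently.

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>
#include <string>
#include <vector>

// Hypothetical stand-in for a PyWrap-like singleton; not the library's actual class.
class InterpreterGuard {
public:
    // Function-local static: initialization is thread-safe since C++11.
    static InterpreterGuard& getInstance() {
        static InterpreterGuard instance;
        return instance;
    }

    // A singleton must not be copied or assigned.
    InterpreterGuard(const InterpreterGuard&) = delete;
    InterpreterGuard& operator=(const InterpreterGuard&) = delete;

    // Every access to shared state takes the same mutex, so concurrent
    // callers are serialized.
    void registerModel(const std::string& id) {
        std::lock_guard<std::mutex> lock(mutex_);
        models_.push_back(id);
    }

    std::size_t modelCount() const {
        std::lock_guard<std::mutex> lock(mutex_);
        return models_.size();
    }

private:
    InterpreterGuard() = default;

    mutable std::mutex mutex_;
    std::vector<std::string> models_;
};

// Usage sketch: both calls return a reference to the same object.
inline void demoSingleton() {
    InterpreterGuard& a = InterpreterGuard::getInstance();
    InterpreterGuard& b = InterpreterGuard::getInstance();
    assert(&a == &b);
}
```

Keeping the constructor private and deleting the copy operations means the only way to reach the object is through getInstance(), which is what makes a single lock sufficient to guard the shared state.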
PyClassifier (pyclfs/PyClassifier.h): Abstract base class inheriting from bayesnet::BaseClassifier. All Python classifier wrappers extend this class.
Individual Classifiers: Each classifier (STree, ODTE, SVC, RandomForest, XGBoost, AdaBoostPy) wraps a specific Python module behind a consistent C++ interface.
Data Flow
- Uses PyTorch tensors for efficient C++/Python data exchange
- JSON-based hyperparameter configuration
- Automatic memory management for Python objects
Key Directories
pyclfs/ - Core library source code
tests/ - Catch2 unit tests with ARFF test datasets
build_debug/ - Debug build artifacts
build_release/ - Release build artifacts
cmake/modules/ - Custom CMake modules
Development Patterns
Adding New Classifiers
- Inherit from PyClassifier base class
- Implement required virtual methods: fit(), predict(), predict_proba()
- Use PyWrap::getInstance() for Python interpreter access
- Handle hyperparameters via JSON configuration
- Add corresponding unit tests in tests/TestPythonClassifiers.cc
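The shape of the steps above (derive from an abstract base, override fit(), predict() and predict_proba()) can be sketched without any Python or PyTorch dependency. Everything here is an assumption made for illustration: ClassifierSketch is a stand-in for PyClassifier, the Matrix alias replaces the tensors the real interface uses, and MajorityClassifier is a toy model, not one of the library's wrappers. The real method signatures are not shown in this file and will differ.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <stdexcept>
#include <vector>

using Matrix = std::vector<std::vector<double>>;

// Hypothetical stand-in for the PyClassifier interface. The real class
// derives from bayesnet::BaseClassifier and exchanges PyTorch tensors.
class ClassifierSketch {
public:
    virtual ~ClassifierSketch() = default;
    virtual void fit(const Matrix& X, const std::vector<int>& y) = 0;
    virtual std::vector<int> predict(const Matrix& X) const = 0;
    virtual Matrix predict_proba(const Matrix& X) const = 0;
};

// Toy classifier: always predicts the most frequent training label.
class MajorityClassifier : public ClassifierSketch {
public:
    void fit(const Matrix& X, const std::vector<int>& y) override {
        if (y.empty() || X.size() != y.size()) {
            throw std::invalid_argument("X and y must be non-empty and equally long");
        }
        counts_.clear();
        for (int label : y) ++counts_[label];
        // Ties resolve to the smallest label because std::map is ordered.
        majority_ = counts_.begin()->first;
        for (const auto& entry : counts_) {
            if (entry.second > counts_.at(majority_)) majority_ = entry.first;
        }
        total_ = y.size();
    }

    std::vector<int> predict(const Matrix& X) const override {
        return std::vector<int>(X.size(), majority_);
    }

    // One row per sample; columns follow ascending label order.
    Matrix predict_proba(const Matrix& X) const override {
        std::vector<double> row;
        for (const auto& entry : counts_) {
            row.push_back(static_cast<double>(entry.second) / total_);
        }
        return Matrix(X.size(), row);
    }

private:
    std::map<int, int> counts_;
    int majority_ = 0;
    std::size_t total_ = 0;
};

// Usage sketch: two of three training labels are 1, so 1 is predicted.
inline void demoMajority() {
    Matrix X = {{0.0}, {1.0}, {2.0}};
    std::vector<int> y = {1, 1, 0};
    MajorityClassifier clf;
    clf.fit(X, y);
    Matrix query = {{5.0}, {6.0}};
    std::vector<int> predicted = clf.predict(query);
    assert(predicted.size() == 2);
    assert(predicted[0] == 1);
}
```

A real wrapper would forward these calls to the Python object obtained through PyWrap and read its hyperparameters from JSON, which this sketch leaves out.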
Python Integration
- All Python interactions go through PyWrap singleton
- Use RAII pattern for Python object management
- Convert data using PyTorch tensors (discrete/continuous data support)
- Handle Python exceptions and convert to C++ exceptions
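Two of the points above, RAII ownership of Python objects and translating Python errors into C++ exceptions, might look roughly like the following. To keep the sketch self-contained it does not use the Python C API or Boost.Python: FakeObject, ScopedRef, CallResult and throwIfFailed() are all hypothetical names standing in for a reference-counted Python object, its owner, and a failed call.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>
#include <utility>

// Hypothetical stand-in for a reference-counted Python object.
struct FakeObject {
    int refcount = 1;
};

// RAII owner: drops one reference when it goes out of scope,
// including when the scope is left through an exception.
class ScopedRef {
public:
    explicit ScopedRef(FakeObject* obj) : obj_(obj) {}
    ~ScopedRef() { release(); }

    ScopedRef(const ScopedRef&) = delete;
    ScopedRef& operator=(const ScopedRef&) = delete;

    // Moving transfers ownership and leaves the source empty.
    ScopedRef(ScopedRef&& other) noexcept
        : obj_(std::exchange(other.obj_, nullptr)) {}
    ScopedRef& operator=(ScopedRef&& other) noexcept {
        if (this != &other) {
            release();
            obj_ = std::exchange(other.obj_, nullptr);
        }
        return *this;
    }

    FakeObject* get() const { return obj_; }

private:
    void release() {
        if (obj_ != nullptr) {
            --obj_->refcount;
            obj_ = nullptr;
        }
    }

    FakeObject* obj_;
};

// Hypothetical result of a call into Python: error text is set on failure.
struct CallResult {
    bool ok;
    std::string error;
};

// Turn a Python-side failure into a C++ exception.
inline void throwIfFailed(const CallResult& result) {
    if (!result.ok) {
        throw std::runtime_error("Python error: " + result.error);
    }
}

// Usage sketch: the reference is released at the end of the inner scope.
inline void demoScopedRef() {
    FakeObject obj;
    {
        ScopedRef ref(&obj);
        assert(obj.refcount == 1);
    }
    assert(obj.refcount == 0);
}
```

Because release happens in the destructor, a throwIfFailed() call between acquiring and using an object cannot leak the reference.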
Testing
- Catch2 framework with parameterized tests using GENERATE()
- Test data in ARFF format located in tests/data/
- Performance benchmarks validate expected accuracy scores
- Coverage reports generated with gcovr
Important Files
pyclfs/PyWrap.h - Python interpreter management
pyclfs/PyClassifier.h - Base classifier interface
CMakeLists.txt - Main build configuration
Makefile - Build automation and common tasks
conanfile.py - Package dependencies
tests/TestPythonClassifiers.cc - Main test suite
Technical Requirements
- C++17 standard compliance
- Python 3.11+ required
- Boost library with Python and NumPy support
- PyTorch for tensor operations
- Thread-safe design for concurrent usage