A high-performance Support Vector Machine classifier implementation in C++ with a scikit-learn compatible API. This library provides a unified interface for SVM classification using both liblinear (for linear kernels) and libsvm (for non-linear kernels), with support for multiclass classification and PyTorch tensor integration.
Features
- Scikit-learn Compatible API: Familiar fit(), predict(), predict_proba(), score() methods
- Multiple Kernels: Linear, RBF, Polynomial, and Sigmoid kernels
- Multiclass Support: One-vs-Rest (OvR) and One-vs-One (OvO) strategies
- Automatic Library Selection: Uses liblinear for linear kernels, libsvm for others
- PyTorch Integration: Native support for libtorch tensors
- JSON Configuration: Easy parameter management with nlohmann::json
- Comprehensive Testing: 100% test coverage with Catch2
- Performance Metrics: Detailed evaluation and training metrics
- Cross-Validation: Built-in k-fold cross-validation support
- Grid Search: Hyperparameter optimization capabilities
Quick Start
Prerequisites
- C++17 or later
- CMake 3.15+
- libtorch
- Git
Building
git clone <repository-url>
cd svm_classifier
mkdir build && cd build
cmake ..
make -j$(nproc)
Basic Usage
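#include <svm_classifier/svm_classifier.hpp>
#include <torch/torch.h>
using namespace svm_classifier;
// Sample data: 100 samples, 2 features, labels in {0, 1, 2}
auto X = torch::randn({100, 2});
auto y = torch::randint(0, 3, {100});
// Create the classifier (constructor form from "Constructor Options") and train it
SVMClassifier svm(KernelType::RBF, 1.0, MulticlassStrategy::ONE_VS_REST);
auto metrics = svm.fit(X, y);
// Predict labels, estimate class probabilities, and compute accuracy
// (predict_proba may require the "probability" parameter to be enabled)
auto predictions = svm.predict(X);
auto probabilities = svm.predict_proba(X);
double accuracy = svm.score(X, y);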
JSON Configuration
#include <nlohmann/json.hpp>
nlohmann::json config = {
    {"kernel", "rbf"},
    {"C", 10.0},
    {"gamma", 0.1},
    {"multiclass_strategy", "ovo"},
    {"probability", true}
};
API Reference
Constructor Options
SVMClassifier svm(KernelType::RBF, 1.0, MulticlassStrategy::ONE_VS_REST);
Core Methods
| Method | Description | Returns |
|--------|-------------|---------|
| fit(X, y) | Train the classifier | TrainingMetrics |
| predict(X) | Predict class labels | torch::Tensor |
| predict_proba(X) | Predict class probabilities | torch::Tensor |
| score(X, y) | Calculate accuracy | double |
| decision_function(X) | Get decision values | torch::Tensor |
| cross_validate(X, y, cv) | K-fold cross-validation | std::vector<double> |
| grid_search(X, y, grid, cv) | Hyperparameter tuning | nlohmann::json |
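The table describes score(X, y) as calculating accuracy. As a rough illustration of that metric, and not the library's own implementation, the sketch below computes accuracy over plain label vectors. The helper name accuracy_score is hypothetical and uses std::vector<int> instead of torch::Tensor so that it compiles with the standard library alone.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of the library): fraction of positions
// where the predicted label equals the true label.
double accuracy_score(const std::vector<int>& y_true,
                      const std::vector<int>& y_pred) {
    assert(y_true.size() == y_pred.size() && !y_true.empty());
    std::size_t correct = 0;
    for (std::size_t i = 0; i < y_true.size(); ++i) {
        if (y_true[i] == y_pred[i]) {
            ++correct;
        }
    }
    return static_cast<double>(correct) / static_cast<double>(y_true.size());
}
```

Calling accuracy_score({0, 1, 2, 1}, {0, 1, 1, 1}) counts three matches out of four labels.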
Parameter Configuration
Common Parameters
- kernel: "linear", "rbf", "polynomial", "sigmoid"
- C: Regularization parameter (default: 1.0)
- multiclass_strategy: "ovr" (One-vs-Rest) or "ovo" (One-vs-One)
- probability: Enable probability estimates (default: false)
- tolerance: Convergence tolerance (default: 1e-3)
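The choice of multiclass_strategy affects how many binary sub-problems are trained. The sketch below shows the textbook counts for the two strategies: one classifier per class for One-vs-Rest, and one per unordered class pair for One-vs-One. The function names are hypothetical, and the library's internal bookkeeping may differ.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// One-vs-Rest trains one binary classifier per class.
std::size_t ovr_classifier_count(std::size_t n_classes) {
    return n_classes;
}

// One-vs-One trains one binary classifier per unordered pair of classes.
std::size_t ovo_classifier_count(std::size_t n_classes) {
    assert(n_classes >= 2);
    return n_classes * (n_classes - 1) / 2;
}

// Enumerates the (i, j) class pairs with i < j that One-vs-One trains on.
std::vector<std::pair<int, int>> ovo_pairs(int n_classes) {
    assert(n_classes >= 2);
    std::vector<std::pair<int, int>> pairs;
    for (int i = 0; i < n_classes; ++i) {
        for (int j = i + 1; j < n_classes; ++j) {
            pairs.emplace_back(i, j);
        }
    }
    return pairs;
}
```

With five classes this gives 5 sub-problems for "ovr" and 10 for "ovo"; each "ovo" sub-problem only sees the samples of its two classes.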
Kernel-Specific Parameters
- RBF/Polynomial/Sigmoid: gamma (default: auto)
- Polynomial: degree (default: 3), coef0 (default: 0.0)
- Sigmoid: coef0 (default: 0.0)
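To make the roles of gamma, degree, and coef0 concrete, the sketch below writes out the four kernel functions in the form libsvm defines them. It is a standalone illustration over std::vector<double>; the function names are hypothetical, and the library delegates the actual kernel evaluation to liblinear and libsvm.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

using Vec = std::vector<double>;

// Inner product of two equally sized vectors.
double dot(const Vec& a, const Vec& b) {
    assert(a.size() == b.size());
    double sum = 0.0;
    for (std::size_t i = 0; i < a.size(); ++i) {
        sum += a[i] * b[i];
    }
    return sum;
}

// Linear: K(a, b) = a . b
double linear_kernel(const Vec& a, const Vec& b) {
    return dot(a, b);
}

// RBF: K(a, b) = exp(-gamma * ||a - b||^2)
double rbf_kernel(const Vec& a, const Vec& b, double gamma) {
    assert(a.size() == b.size());
    double squared_distance = 0.0;
    for (std::size_t i = 0; i < a.size(); ++i) {
        const double d = a[i] - b[i];
        squared_distance += d * d;
    }
    return std::exp(-gamma * squared_distance);
}

// Polynomial: K(a, b) = (gamma * a . b + coef0)^degree
double polynomial_kernel(const Vec& a, const Vec& b, double gamma,
                         int degree = 3, double coef0 = 0.0) {
    return std::pow(gamma * dot(a, b) + coef0, degree);
}

// Sigmoid: K(a, b) = tanh(gamma * a . b + coef0)
double sigmoid_kernel(const Vec& a, const Vec& b, double gamma,
                      double coef0 = 0.0) {
    return std::tanh(gamma * dot(a, b) + coef0);
}
```

gamma scales the similarity in all three non-linear kernels, which is why it is listed for RBF, Polynomial, and Sigmoid alike.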
Examples
Multi-class Classification
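// Five-class problem: 300 samples, 4 features
auto X = torch::randn({300, 4});
auto y = torch::randint(0, 5, {300});
nlohmann::json config = {
    {"kernel", "rbf"},
    {"C", 1.0},
    {"gamma", 0.1},
    {"multiclass_strategy", "ovo"},
    {"probability", true}
};
// svm is assumed to be an SVMClassifier configured from config
auto metrics = svm.fit(X, y);
// Detailed metrics on the same data
auto eval_metrics = svm.evaluate(X, y);
std::cout << "Accuracy: " << eval_metrics.accuracy << std::endl;
std::cout << "F1-Score: " << eval_metrics.f1_score << std::endl;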
evaluate(const torch::Tensor &X, const torch::Tensor &y_true) returns an EvaluationMetrics object with detailed evaluation metrics, such as the accuracy and f1_score fields printed above.
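The README does not say how f1_score is averaged across classes. As one common convention, and purely as an assumption about what such a metric might compute, the sketch below shows a macro-averaged F1 over integer labels. The function name macro_f1 is hypothetical and not part of the library.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: macro-averaged F1 for labels in [0, n_classes).
// Per class, F1 = 2*TP / (2*TP + FP + FN); classes that never appear
// in y_true or y_pred contribute 0.
double macro_f1(const std::vector<int>& y_true,
                const std::vector<int>& y_pred,
                int n_classes) {
    assert(y_true.size() == y_pred.size() && n_classes > 0);
    const std::size_t k = static_cast<std::size_t>(n_classes);
    std::vector<int> tp(k, 0), fp(k, 0), fn(k, 0);
    for (std::size_t i = 0; i < y_true.size(); ++i) {
        const int t = y_true[i];
        const int p = y_pred[i];
        assert(t >= 0 && t < n_classes && p >= 0 && p < n_classes);
        if (t == p) {
            ++tp[t];
        } else {
            ++fp[p];  // predicted class gains a false positive
            ++fn[t];  // true class gains a false negative
        }
    }
    double sum = 0.0;
    for (std::size_t c = 0; c < k; ++c) {
        const int denom = 2 * tp[c] + fp[c] + fn[c];
        if (denom > 0) {
            sum += 2.0 * tp[c] / denom;
        }
    }
    return sum / static_cast<double>(n_classes);
}
```

Macro averaging weights every class equally, so it can differ noticeably from accuracy when class sizes are imbalanced.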
Cross-Validation
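// Run 5-fold cross-validation, then average the returned scores
auto cv_scores = svm.cross_validate(X, y, 5);
double mean_score = 0.0;
for (auto score : cv_scores) {
    mean_score += score;
}
mean_score /= cv_scores.size();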
cross_validate(const torch::Tensor &X, const torch::Tensor &y, int cv=5) performs k-fold cross-validation and returns the scores as a std::vector<double>; cv defaults to 5.
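For readers unfamiliar with k-fold splitting, the sketch below shows one simple way to partition sample indices into k contiguous validation folds. It is an illustration only: the helper name kfold_ranges is hypothetical, and the library may shuffle or stratify the data, which this sketch does not do.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical helper: for each fold, the half-open index range
// [begin, end) held out for validation; all other indices are used
// for training in that round.
std::vector<std::pair<std::size_t, std::size_t>> kfold_ranges(
    std::size_t n_samples, std::size_t k) {
    assert(k >= 2 && k <= n_samples);
    std::vector<std::pair<std::size_t, std::size_t>> folds;
    folds.reserve(k);
    const std::size_t base = n_samples / k;
    const std::size_t extra = n_samples % k;  // first `extra` folds get one more sample
    std::size_t begin = 0;
    for (std::size_t f = 0; f < k; ++f) {
        const std::size_t size = base + (f < extra ? 1 : 0);
        folds.emplace_back(begin, begin + size);
        begin += size;
    }
    return folds;
}
```

With 10 samples and k = 3, the validation ranges are [0, 4), [4, 7), and [7, 10), so every sample is held out exactly once.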
Grid Search
nlohmann::json param_grid = {
    {"C", {0.1, 1.0, 10.0}},
    {"gamma", {0.01, 0.1, 1.0}},
    {"kernel", {"rbf", "polynomial"}}
};
auto best_params = svm.grid_search(X, y, param_grid, 3);
std::cout << "Best parameters: " << best_params.dump(2) << std::endl;
grid_search(const torch::Tensor &X, const torch::Tensor &y, const nlohmann::json &param_grid, int cv=5) finds optimal hyperparameters using grid search and returns them as nlohmann::json; cv defaults to 5.
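Conceptually, a grid search evaluates every combination of the listed values and keeps the best-scoring one. The sketch below illustrates that idea with plain standard-library types instead of nlohmann::json. ParamSet, expand_grid, and pick_best are hypothetical names, and the scoring callback stands in for whatever score the library computes per candidate (presumably a cross-validation score).

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical container for one candidate configuration.
struct ParamSet {
    double C;
    double gamma;
    std::string kernel;
};

// Cartesian product of the candidate values for each parameter.
std::vector<ParamSet> expand_grid(const std::vector<double>& Cs,
                                  const std::vector<double>& gammas,
                                  const std::vector<std::string>& kernels) {
    std::vector<ParamSet> out;
    out.reserve(Cs.size() * gammas.size() * kernels.size());
    for (double C : Cs) {
        for (double gamma : gammas) {
            for (const std::string& kernel : kernels) {
                out.push_back({C, gamma, kernel});
            }
        }
    }
    return out;
}

// Returns the candidate with the highest score; the first one wins ties.
template <typename ScoreFn>
ParamSet pick_best(const std::vector<ParamSet>& candidates, ScoreFn score) {
    assert(!candidates.empty());
    ParamSet best = candidates.front();
    double best_score = score(best);
    for (const ParamSet& candidate : candidates) {
        const double s = score(candidate);
        if (s > best_score) {
            best_score = s;
            best = candidate;
        }
    }
    return best;
}
```

The grid from the example above (three C values, three gamma values, two kernels) expands to 18 candidates, each of which would be trained and scored cv times, so cost grows multiplicatively with every added parameter.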
Testing
Run All Tests
make test_all
Test Categories
make test_unit # Unit tests
make test_integration # Integration tests
make test_performance # Performance tests
Coverage Report
cmake -DCMAKE_BUILD_TYPE=Debug ..
make coverage
The coverage report will be generated in build/coverage_html/index.html.
Project Structure
svm_classifier/
├── include/svm_classifier/     # Public headers
│   ├── svm_classifier.hpp      # Main classifier interface
│   ├── data_converter.hpp      # Tensor conversion utilities
│   ├── multiclass_strategy.hpp # Multiclass strategies
│   ├── kernel_parameters.hpp   # Parameter management
│   └── types.hpp               # Common types and enums
├── src/                        # Implementation files
├── tests/                      # Comprehensive test suite
├── examples/                   # Usage examples
├── external/                   # Third-party dependencies
└── CMakeLists.txt              # Build configuration
Dependencies
Required
- libtorch: PyTorch C++ API for tensor operations
- liblinear: Linear SVM implementation
- libsvm: Non-linear SVM implementation
- nlohmann/json: JSON configuration handling
Testing
- Catch2: Testing framework
Build System
- CMake: Cross-platform build system
Performance Characteristics
Memory Usage
- Efficient sparse data handling
- Automatic memory management for SVM structures
- Configurable cache sizes for large datasets
Speed
- Linear kernels: Uses highly optimized liblinear
- Non-linear kernels: Uses proven libsvm implementation
- Multi-threading support via libtorch
Scalability
- Handles datasets from hundreds to millions of samples
- Memory-efficient data conversion
- Sparse feature support
Library Selection Logic
The classifier automatically selects the appropriate underlying library:
- Linear Kernel → liblinear (optimized for linear classification)
- RBF/Polynomial/Sigmoid → libsvm (supports arbitrary kernels)
This ensures optimal performance for each kernel type while maintaining a unified API.
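The selection rule above can be sketched as a small dispatch function. The enum and function names below (Kernel, Backend, parse_kernel, select_backend) are hypothetical stand-ins chosen for this illustration; the library's own types, such as KernelType, may be organized differently.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Hypothetical enums for this sketch only.
enum class Kernel { Linear, Rbf, Polynomial, Sigmoid };
enum class Backend { Liblinear, Libsvm };

// Maps the kernel names used in the JSON configuration to an enum value.
Kernel parse_kernel(const std::string& name) {
    if (name == "linear") return Kernel::Linear;
    if (name == "rbf") return Kernel::Rbf;
    if (name == "polynomial") return Kernel::Polynomial;
    if (name == "sigmoid") return Kernel::Sigmoid;
    throw std::invalid_argument("unknown kernel: " + name);
}

// Linear kernels go to liblinear; every other kernel goes to libsvm.
Backend select_backend(Kernel kernel) {
    return kernel == Kernel::Linear ? Backend::Liblinear : Backend::Libsvm;
}
```

Keeping the decision in one function means callers never need to know which backend handled their model, which is the point of the unified API.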
Contributing
- Fork the repository
- Create a feature branch
- Add tests for new functionality
- Ensure all tests pass: make test_all
- Check code coverage: make coverage
- Submit a pull request
Code Style
- Follow modern C++17 conventions
- Use RAII for resource management
- Provide comprehensive error handling
- Document all public APIs
License
[Specify your license here]
Acknowledgments
- libsvm: Chih-Chung Chang and Chih-Jen Lin
- liblinear: Fan et al.
- PyTorch: Facebook AI Research
- nlohmann/json: Niels Lohmann
- Catch2: Phil Nash and contributors