#6 - Update tests and codecov conf

#6 - Add multiclass support
Removed predict_proba (for now).
Created a notebook in Jupyter.
Added split_criteria parameter with min_distance and max_samples values.
Refactor _distances.
Refactor _split_criteria.
Refactor _reorder_results.
2025-08-17 00:16:07 +00:00 · 2020-06-11 13:45:24 +02:00 · 2020-06-11 13:10:52 +02:00 · 2020-06-09 13:43:31 +02:00 · 2020-06-09 13:01:01 +02:00 · 2020-06-09 02:12:56 +02:00
30 changed files with 2630 additions and 936 deletions
--- a/.coveragerc
+++ b/.coveragerc
@@ -0,0 +1,14 @@
+[run]
+branch = True
+source = stree
+
+[report]
+exclude_lines =
+    if self.debug:
+    pragma: no cover
+    raise NotImplementedError
+    if __name__ == .__main__.:
+ignore_errors = True
+omit =
+    stree/tests/*
+    stree/__init__.py
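
The `exclude_lines` entries in `.coveragerc` are regular expressions that coverage.py checks against each source line. That is why the `__main__` guard is written with `.` in place of the quote characters: a single pattern then covers both quoting styles. A minimal sketch of that matching, using only `re` from the standard library, is given here. The helper name `is_excluded` is hypothetical, and this is an illustration of the idea rather than coverage.py's internal code.

```python
import re

# Patterns copied from the exclude_lines section of .coveragerc.
EXCLUDE_PATTERNS = [
    r"if self.debug:",
    r"pragma: no cover",
    r"raise NotImplementedError",
    r"if __name__ == .__main__.:",
]


def is_excluded(line: str) -> bool:
    """Return True if any pattern is found anywhere in the line."""
    return any(re.search(pattern, line) for pattern in EXCLUDE_PATTERNS)


samples = [
    'if __name__ == "__main__":',
    "if __name__ == '__main__':",
    "    raise NotImplementedError  # abstract method",
    "x = compute()  # pragma: no cover",
    "return self.debug",
]
for line in samples:
    print(f"{is_excluded(line)!s:5}  {line}")
```

Because the patterns are searched for rather than anchored, indentation and trailing comments do not prevent a match.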
--- a/.gitignore
+++ b/.gitignore
@@ -129,4 +129,5 @@ dmypy.json
 .pyre/

 .idea
-.vscode
+.vscode
+.pre-commit-config.yaml
--- a/.travis.yml
+++ b/.travis.yml
@@ -1,13 +0,0 @@
-language: python
-os: linux
-dist: xenial
-install:
-  - pip install -r requirements.txt
-notifications:
-  email:
-    recipients:
-      - ricardo.montanana@alu.uclm.es
-    on_success: never # default: change
-    on_failure: always # default: always
-# command to run tests
-script: python -m unittest tests.Stree_test tests.Snode_test
--- a/README.md
+++ b/README.md
@@ -1,23 +1,43 @@
-[![Build Status](https://travis-ci.com/Doctorado-ML/STree.svg?branch=master)](https://travis-ci.com/Doctorado-ML/STree)
+[![Codeship Status for Doctorado-ML/STree](https://app.codeship.com/projects/8b2bd350-8a1b-0138-5f2c-3ad36f3eb318/status?branch=master)](https://app.codeship.com/projects/399170)
+[![codecov](https://codecov.io/gh/doctorado-ml/stree/branch/master/graph/badge.svg)](https://codecov.io/gh/doctorado-ml/stree)
+[![Codacy Badge](https://app.codacy.com/project/badge/Grade/35fa3dfd53a24a339344b33d9f9f2f3d)](https://www.codacy.com/gh/Doctorado-ML/STree?utm_source=github.com&amp;utm_medium=referral&amp;utm_content=Doctorado-ML/STree&amp;utm_campaign=Badge_Grade)

 # Stree

-Oblique Tree classifier based on SVM nodes
+Oblique Tree classifier based on SVM nodes. The nodes are built and split with sklearn SVC models. Stree is a sklearn estimator and can be integrated in pipelines, grid searches, etc.

-## Example
+![Stree](https://raw.github.com/doctorado-ml/stree/master/example.png)

-### Jupyter
+## Installation

-[![Binder](https://mybinder.org/badge_logo.svg)](https://mybinder.org/v2/gh/Doctorado-ML/STree/master?urlpath=lab/tree/test.ipynb)
+```bash
+pip install git+https://github.com/doctorado-ml/stree
+```
+
+## Examples
+
+### Jupyter notebooks
+
+* [![Binder](https://mybinder.org/badge_logo.svg)](https://mybinder.org/v2/gh/Doctorado-ML/STree/master?urlpath=lab/tree/notebooks/benchmark.ipynb) Benchmark
+
+* [![Test](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/Doctorado-ML/STree/blob/master/notebooks/benchmark.ipynb) Benchmark
+
+* [![Test2](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/Doctorado-ML/STree/blob/master/notebooks/features.ipynb) Test features
+
+* [![Adaboost](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/Doctorado-ML/STree/blob/master/notebooks/adaboost.ipynb) Adaboost
+
+* [![Gridsearch](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/Doctorado-ML/STree/blob/master/notebooks/gridsearch.ipynb) Gridsearch
+
+* [![Test Graphics](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/Doctorado-ML/STree/blob/master/notebooks/test_graphs.ipynb) Test Graphics

 ### Command line

-```python
+```bash
 python main.py
 ```

 ## Tests

-```python
-python -m unittest -v tests.Stree_test tests.Snode_test
+```bash
+python -m unittest -v stree.tests
 ```
--- a/codecov.yml
+++ b/codecov.yml
@@ -0,0 +1,12 @@
+coverage:
+  status:
+    project:
+      default:
+        target: 90%
+comment:
+  layout: "reach, diff, flags, files"
+  behavior: default
+  require_changes: false
+  require_base: yes
+  require_head: yes
+  branches: null
--- a/data/.gitignore
+++ b/data/.gitignore
@@ -1,2 +0,0 @@
-*.csv
-*.txt
--- a/example.png
+++ b/example.png
--- a/main.py
+++ b/main.py
@@ -1,18 +1,30 @@
 import time
 from sklearn.model_selection import train_test_split
-from trees.Stree import Stree
+from stree import Stree
+
+random_state = 1

-random_state=1

 def load_creditcard(n_examples=0):
    import pandas as pd
    import numpy as np
    import random
-    df = pd.read_csv('data/creditcard.csv')
-    print("Fraud: {0:.3f}% {1}".format(df.Class[df.Class == 1].count()*100/df.shape[0], df.Class[df.Class == 1].count()))
-    print("Valid: {0:.3f}% {1}".format(df.Class[df.Class == 0].count()*100/df.shape[0], df.Class[df.Class == 0].count()))
+
+    df = pd.read_csv("data/creditcard.csv")
+    print(
+        "Fraud: {0:.3f}% {1}".format(
+            df.Class[df.Class == 1].count() * 100 / df.shape[0],
+            df.Class[df.Class == 1].count(),
+        )
+    )
+    print(
+        "Valid: {0:.3f}% {1}".format(
+            df.Class[df.Class == 0].count() * 100 / df.shape[0],
+            df.Class[df.Class == 0].count(),
+        )
+    )
    y = np.expand_dims(df.Class.values, axis=1)
-    X = df.drop(['Class', 'Time', 'Amount'], axis=1).values
+    X = df.drop(["Class", "Time", "Amount"], axis=1).values
    if n_examples > 0:
        # Take first n_examples samples
        X = X[:n_examples, :]
@@ -26,14 +38,30 @@ def load_creditcard(n_examples=0):
            X = np.append(Xt, X[indices], axis=0)
            y = np.append(yt, y[indices], axis=0)
    print("X.shape", X.shape, " y.shape", y.shape)
-    print("Fraud: {0:.3f}% {1}".format(len(y[y == 1])*100/X.shape[0], len(y[y == 1])))
-    print("Valid: {0:.3f}% {1}".format(len(y[y == 0]) * 100 / X.shape[0], len(y[y == 0])))
-    Xtrain, Xtest, ytrain, ytest = train_test_split(X, y, train_size=0.7, shuffle=True, random_state=random_state, stratify=y)
+    print(
+        "Fraud: {0:.3f}% {1}".format(
+            len(y[y == 1]) * 100 / X.shape[0], len(y[y == 1])
+        )
+    )
+    print(
+        "Valid: {0:.3f}% {1}".format(
+            len(y[y == 0]) * 100 / X.shape[0], len(y[y == 0])
+        )
+    )
+    Xtrain, Xtest, ytrain, ytest = train_test_split(
+        X,
+        y,
+        train_size=0.7,
+        shuffle=True,
+        random_state=random_state,
+        stratify=y,
+    )
    return Xtrain, Xtest, ytrain, ytest

+
 # data = load_creditcard(-5000) # Take all true samples + 5000 of the others
 # data = load_creditcard(5000)  # Take the first 5000 samples
-data = load_creditcard() # Take all the samples
+data = load_creditcard()  # Take all the samples

 Xtrain = data[0]
 Xtest = data[1]
@@ -41,18 +69,9 @@ ytrain = data[2]
 ytest = data[3]

 now = time.time()
-clf = Stree(C=.01, random_state=random_state)
+clf = Stree(C=0.01, random_state=random_state)
 clf.fit(Xtrain, ytrain)
 print(f"Took {time.time() - now:.2f} seconds to train")
 print(clf)
 print(f"Classifier's accuracy (train): {clf.score(Xtrain, ytrain):.4f}")
 print(f"Classifier's accuracy (test) : {clf.score(Xtest, ytest):.4f}")
-proba = clf.predict_proba(Xtest)
-print("Checking that we have correct probabilities, these are probabilities of sample belonging to class 1")
-res0 = proba[proba[:, 0] == 0]
-res1 = proba[proba[:, 0] == 0]
-print("++++++++++res0++++++++++++")
-print(res0[res0[:, 1] > .8])
-print("**********res1************")
-print(res1[res1[:, 1] < .4])
-print(clf.predict_proba(Xtest))
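
The three call styles listed in the comments of `main.py` (`load_creditcard()`, `load_creditcard(5000)` and `load_creditcard(-5000)`) select rows differently depending on the sign of `n_examples`. A minimal, dependency-free sketch of that selection rule on a toy dataset follows. `select_rows` is a hypothetical helper, not a function in the repository, and it takes an explicit `random.Random` so that the result is reproducible. As the notebook copies of this function show, the negative branch draws its random rows from the whole dataset, so some positives can be drawn again and appear twice.

```python
import random


def select_rows(X, y, n_examples=0, rng=None):
    """Mimic the row selection of load_creditcard on plain lists.

    n_examples > 0:  keep the first n_examples rows.
    n_examples < 0:  keep every positive row plus -n_examples rows drawn
                     at random from the whole dataset (positives included).
    n_examples == 0: keep everything.
    """
    rng = rng or random.Random()
    if n_examples > 0:
        return X[:n_examples], y[:n_examples]
    if n_examples < 0:
        positives = [i for i, label in enumerate(y) if label == 1]
        extra = rng.sample(range(len(X)), -n_examples)
        chosen = positives + extra
        return [X[i] for i in chosen], [y[i] for i in chosen]
    return X, y


# Toy data: 10 rows, rows 3 and 7 belong to the positive class.
X = [[i] for i in range(10)]
y = [1 if i in (3, 7) else 0 for i in range(10)]

X_first, y_first = select_rows(X, y, 4)
X_mix, y_mix = select_rows(X, y, -5, rng=random.Random(1))
print(len(X_first), len(X_mix), sum(y_mix))
```

The number of positives in the mixed sample is at least the number in the original data, which matches the notebook outputs where the fraud count after sampling is higher than 492.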
--- a/notebooks/adaboost.ipynb
+++ b/notebooks/adaboost.ipynb
@@ -0,0 +1,232 @@
+{
+ "cells": [
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "# Test AdaBoost with different configurations"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "# Setup\n",
+    "Uncomment the next cell if STree is not already installed"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 1,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "#\n",
+    "# Google Colab setup\n",
+    "#\n",
+    "#!pip install git+https://github.com/doctorado-ml/stree"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 2,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "import time\n",
+    "from sklearn.ensemble import AdaBoostClassifier\n",
+    "from sklearn.tree import DecisionTreeClassifier\n",
+    "from sklearn.svm import LinearSVC, SVC\n",
+    "from sklearn.model_selection import GridSearchCV, train_test_split\n",
+    "from sklearn.datasets import load_iris\n",
+    "from stree import Stree"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 3,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "import os\n",
+    "if not os.path.isfile('data/creditcard.csv'):\n",
+    "    !wget --no-check-certificate --content-disposition http://nube.jccm.es/index.php/s/Zs7SYtZQJ3RQ2H2/download\n",
+    "    !tar xzf creditcard.tgz"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 4,
+   "metadata": {},
+   "outputs": [
+    {
+     "output_type": "stream",
+     "name": "stdout",
+     "text": "Fraud: 0.173% 492\nValid: 99.827% 284315\nX.shape (100492, 28)  y.shape (100492,)\nFraud: 0.659% 662\nValid: 99.341% 99830\n"
+    }
+   ],
+   "source": [
+    "random_state=1\n",
+    "\n",
+    "def load_creditcard(n_examples=0):\n",
+    "    import pandas as pd\n",
+    "    import numpy as np\n",
+    "    import random\n",
+    "    df = pd.read_csv('data/creditcard.csv')\n",
+    "    print(\"Fraud: {0:.3f}% {1}\".format(df.Class[df.Class == 1].count()*100/df.shape[0], df.Class[df.Class == 1].count()))\n",
+    "    print(\"Valid: {0:.3f}% {1}\".format(df.Class[df.Class == 0].count()*100/df.shape[0], df.Class[df.Class == 0].count()))\n",
+    "    y = df.Class\n",
+    "    X = df.drop(['Class', 'Time', 'Amount'], axis=1).values\n",
+    "    if n_examples > 0:\n",
+    "        # Take first n_examples samples\n",
+    "        X = X[:n_examples, :]\n",
+    "        y = y[:n_examples]\n",
+    "    else:\n",
+    "        # Take all the positive samples with a number of random negatives\n",
+    "        if n_examples < 0:\n",
+    "            Xt = X[(y == 1).ravel()]\n",
+    "            yt = y[(y == 1).ravel()]\n",
+    "            indices = random.sample(range(X.shape[0]), -1 * n_examples)\n",
+    "            X = np.append(Xt, X[indices], axis=0)\n",
+    "            y = np.append(yt, y[indices], axis=0)\n",
+    "    print(\"X.shape\", X.shape, \" y.shape\", y.shape)\n",
+    "    print(\"Fraud: {0:.3f}% {1}\".format(len(y[y == 1])*100/X.shape[0], len(y[y == 1])))\n",
+    "    print(\"Valid: {0:.3f}% {1}\".format(len(y[y == 0]) * 100 / X.shape[0], len(y[y == 0])))\n",
+    "    Xtrain, Xtest, ytrain, ytest = train_test_split(X, y, train_size=0.7, shuffle=True, random_state=random_state, stratify=y)\n",
+    "    return Xtrain, Xtest, ytrain, ytest\n",
+    "\n",
+    "# data = load_creditcard(-1000) # Take all true samples + 1000 of the others\n",
+    "# data = load_creditcard(5000)  # Take the first 5000 samples\n",
+    "# data = load_creditcard(0) # Take all the samples\n",
+    "data = load_creditcard(-100000)\n",
+    "\n",
+    "Xtrain = data[0]\n",
+    "Xtest = data[1]\n",
+    "ytrain = data[2]\n",
+    "ytest = data[3]"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "# Tests"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## STree alone on the whole dataset and linear kernel"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 5,
+   "metadata": {},
+   "outputs": [
+    {
+     "output_type": "stream",
+     "name": "stdout",
+     "text": "Score Train:  0.9985499829409757\nScore Test:  0.998407854584052\nTook 39.45 seconds\n"
+    }
+   ],
+   "source": [
+    "now = time.time()\n",
+    "clf = Stree(max_depth=3, random_state=random_state)\n",
+    "clf.fit(Xtrain, ytrain)\n",
+    "print(\"Score Train: \", clf.score(Xtrain, ytrain))\n",
+    "print(\"Score Test: \", clf.score(Xtest, ytest))\n",
+    "print(f\"Took {time.time() - now:.2f} seconds\")"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Different kernels with different configurations"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 6,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "n_estimators = 10\n",
+    "C = 7\n",
+    "max_depth = 3"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 7,
+   "metadata": {},
+   "outputs": [
+    {
+     "output_type": "stream",
+     "name": "stdout",
+     "text": "Kernel: linear\tTime: 87.00 seconds\tScore Train: 0.9982372\tScore Test: 0.9981425\nKernel: rbf\tTime: 60.60 seconds\tScore Train: 0.9934181\tScore Test: 0.9933992\nKernel: poly\tTime: 88.08 seconds\tScore Train: 0.9937450\tScore Test: 0.9938968\n"
+    }
+   ],
+   "source": [
+    "for kernel in ['linear', 'rbf', 'poly']:\n",
+    "    now = time.time()\n",
+    "    clf = AdaBoostClassifier(Stree(C=7, kernel=kernel, max_depth=max_depth, random_state=random_state), n_estimators=n_estimators, random_state=random_state)\n",
+    "    clf.fit(Xtrain, ytrain)\n",
+    "    score_train = clf.score(Xtrain, ytrain)\n",
+    "    score_test = clf.score(Xtest, ytest)\n",
+    "    print(f\"Kernel: {kernel}\\tTime: {time.time() - now:.2f} seconds\\tScore Train: {score_train:.7f}\\tScore Test: {score_test:.7f}\")"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Test algorithm SAMME in AdaBoost to check speed/accuracy"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 8,
+   "metadata": {},
+   "outputs": [
+    {
+     "output_type": "stream",
+     "name": "stdout",
+     "text": "Kernel: linear\tTime: 58.75 seconds\tScore Train: 0.9980524\tScore Test: 0.9978771\nKernel: rbf\tTime: 12.49 seconds\tScore Train: 0.9934181\tScore Test: 0.9933992\nKernel: poly\tTime: 97.85 seconds\tScore Train: 0.9972137\tScore Test: 0.9971806\n"
+    }
+   ],
+   "source": [
+    "for kernel in ['linear', 'rbf', 'poly']:\n",
+    "    now = time.time()\n",
+    "    clf = AdaBoostClassifier(Stree(C=7, kernel=kernel, max_depth=max_depth, random_state=random_state), n_estimators=n_estimators, random_state=random_state, algorithm=\"SAMME\")\n",
+    "    clf.fit(Xtrain, ytrain)\n",
+    "    score_train = clf.score(Xtrain, ytrain)\n",
+    "    score_test = clf.score(Xtest, ytest)\n",
+    "    print(f\"Kernel: {kernel}\\tTime: {time.time() - now:.2f} seconds\\tScore Train: {score_train:.7f}\\tScore Test: {score_test:.7f}\")"
+   ]
+  }
+ ],
+ "metadata": {
+  "language_info": {
+   "codemirror_mode": {
+    "name": "ipython",
+    "version": 3
+   },
+   "file_extension": ".py",
+   "mimetype": "text/x-python",
+   "name": "python",
+   "nbconvert_exporter": "python",
+   "pygments_lexer": "ipython3",
+   "version": "3.7.6-final"
+  },
+  "orig_nbformat": 2,
+  "kernelspec": {
+   "name": "python37664bitgeneralvenvfbd0a23e74cf4e778460f5ffc6761f39",
+   "display_name": "Python 3.7.6 64-bit ('general': venv)"
+  }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 2
+}
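
The last cell of `adaboost.ipynb` switches `AdaBoostClassifier` to `algorithm="SAMME"`. In the discrete SAMME variant, each boosting round gives its estimator a weight computed from the weighted error rate, `alpha = ln((1 - err) / err) + ln(K - 1)` for `K` classes, and then up-weights the misclassified samples. A rough stdlib-only sketch of one such round follows. `samme_round` is a hypothetical helper written for illustration, and it leaves out scikit-learn's learning rate and other implementation details.

```python
import math


def samme_round(weights, correct, n_classes):
    """One discrete SAMME update.

    weights: current sample weights (any positive scale)
    correct: booleans, True where the weak learner predicted correctly
    Returns (alpha, new_weights) with new_weights normalised to sum to 1.
    """
    total = sum(weights)
    err = sum(w for w, ok in zip(weights, correct) if not ok) / total
    # The learner must be imperfect and better than random guessing.
    if not 0.0 < err < 1.0 - 1.0 / n_classes:
        raise ValueError("error rate outside the usable range")
    alpha = math.log((1.0 - err) / err) + math.log(n_classes - 1)
    # Only misclassified samples are scaled up, by exp(alpha).
    boosted = [
        w if ok else w * math.exp(alpha) for w, ok in zip(weights, correct)
    ]
    norm = sum(boosted)
    return alpha, [w / norm for w in boosted]


weights = [0.25, 0.25, 0.25, 0.25]
correct = [True, True, True, False]
alpha, new_weights = samme_round(weights, correct, n_classes=2)
print(f"alpha={alpha:.4f}", [round(w, 4) for w in new_weights])
```

With two classes the `ln(K - 1)` term vanishes, so on the binary credit card data the update reduces to the classic AdaBoost weight.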
--- a/notebooks/benchmark.ipynb
+++ b/notebooks/benchmark.ipynb
--- a/notebooks/features.ipynb
+++ b/notebooks/features.ipynb
@@ -0,0 +1,370 @@
+{
+ "cells": [
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "# Test sample_weight, kernels, C, sklearn estimator"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "# Setup\n",
+    "Uncomment the next cell if STree is not already installed"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 1,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "#\n",
+    "# Google Colab setup\n",
+    "#\n",
+    "#!pip install git+https://github.com/doctorado-ml/stree"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 2,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "import numpy as np\n",
+    "import pandas as pd\n",
+    "from sklearn.svm import SVC\n",
+    "from sklearn.tree import DecisionTreeClassifier\n",
+    "from sklearn.utils.estimator_checks import check_estimator\n",
+    "from sklearn.datasets import make_classification, load_iris, load_wine\n",
+    "from sklearn.model_selection import train_test_split\n",
+    "from stree import Stree\n",
+    "import time"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 3,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "import os\n",
+    "if not os.path.isfile('data/creditcard.csv'):\n",
+    "    !wget --no-check-certificate --content-disposition http://nube.jccm.es/index.php/s/Zs7SYtZQJ3RQ2H2/download\n",
+    "    !tar xzf creditcard.tgz"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 4,
+   "metadata": {},
+   "outputs": [
+    {
+     "output_type": "stream",
+     "name": "stdout",
+     "text": "Fraud: 0.173% 492\nValid: 99.827% 284315\nX.shape (1492, 28)  y.shape (1492,)\nFraud: 33.110% 494\nValid: 66.890% 998\n"
+    }
+   ],
+   "source": [
+    "random_state=1\n",
+    "\n",
+    "def load_creditcard(n_examples=0):\n",
+    "    import pandas as pd\n",
+    "    import numpy as np\n",
+    "    import random\n",
+    "    df = pd.read_csv('data/creditcard.csv')\n",
+    "    print(\"Fraud: {0:.3f}% {1}\".format(df.Class[df.Class == 1].count()*100/df.shape[0], df.Class[df.Class == 1].count()))\n",
+    "    print(\"Valid: {0:.3f}% {1}\".format(df.Class[df.Class == 0].count()*100/df.shape[0], df.Class[df.Class == 0].count()))\n",
+    "    y = df.Class\n",
+    "    X = df.drop(['Class', 'Time', 'Amount'], axis=1).values\n",
+    "    if n_examples > 0:\n",
+    "        # Take first n_examples samples\n",
+    "        X = X[:n_examples, :]\n",
+    "        y = y[:n_examples]\n",
+    "    else:\n",
+    "        # Take all the positive samples with a number of random negatives\n",
+    "        if n_examples < 0:\n",
+    "            Xt = X[(y == 1).ravel()]\n",
+    "            yt = y[(y == 1).ravel()]\n",
+    "            indices = random.sample(range(X.shape[0]), -1 * n_examples)\n",
+    "            X = np.append(Xt, X[indices], axis=0)\n",
+    "            y = np.append(yt, y[indices], axis=0)\n",
+    "    print(\"X.shape\", X.shape, \" y.shape\", y.shape)\n",
+    "    print(\"Fraud: {0:.3f}% {1}\".format(len(y[y == 1])*100/X.shape[0], len(y[y == 1])))\n",
+    "    print(\"Valid: {0:.3f}% {1}\".format(len(y[y == 0]) * 100 / X.shape[0], len(y[y == 0])))\n",
+    "    Xtrain, Xtest, ytrain, ytest = train_test_split(X, y, train_size=0.7, shuffle=True, random_state=random_state, stratify=y)\n",
+    "    return Xtrain, Xtest, ytrain, ytest\n",
+    "\n",
+    "# data = load_creditcard(-5000) # Take all true samples + 5000 of the others\n",
+    "# data = load_creditcard(5000)  # Take the first 5000 samples\n",
+    "data = load_creditcard(-1000) # Take all true samples + 1000 of the others\n",
+    "\n",
+    "Xtrain = data[0]\n",
+    "Xtest = data[1]\n",
+    "ytrain = data[2]\n",
+    "ytest = data[3]\n",
+    "# Set sample weights inversely to each class count in the dataset\n",
+    "weights = np.ones(Xtrain.shape[0],) * 1.00244\n",
+    "weights[ytrain==1] = 1.99755\n",
+    "weights_test = np.ones(Xtest.shape[0],) * 1.00244\n",
+    "weights_test[ytest==1] = 1.99755 "
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "# Tests"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Test sample_weight\n",
+    "Compute accuracy with per-sample weights. The weights are set from the inverse of the number of samples in each class."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 5,
+   "metadata": {},
+   "outputs": [
+    {
+     "output_type": "stream",
+     "name": "stdout",
+     "text": "Accuracy of Train without weights 0.9789272030651341\nAccuracy of Train with    weights 0.9952107279693486\nAccuracy of Tests without weights 0.9598214285714286\nAccuracy of Tests with    weights 0.9508928571428571\n"
+    }
+   ],
+   "source": [
+    "C = 23\n",
+    "print(\"Accuracy of Train without weights\", Stree(C=C, random_state=1).fit(Xtrain, ytrain).score(Xtrain, ytrain))\n",
+    "print(\"Accuracy of Train with    weights\", Stree(C=C, random_state=1).fit(Xtrain, ytrain, sample_weight=weights).score(Xtrain, ytrain))\n",
+    "print(\"Accuracy of Tests without weights\", Stree(C=C, random_state=1).fit(Xtrain, ytrain).score(Xtest, ytest))\n",
+    "print(\"Accuracy of Tests with    weights\", Stree(C=C, random_state=1).fit(Xtrain, ytrain, sample_weight=weights).score(Xtest, ytest))"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Test accuracy with different kernels\n",
+    "Compute accuracy on the train and test sets with the default hyperparameters of each kernel"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 6,
+   "metadata": {},
+   "outputs": [
+    {
+     "output_type": "stream",
+     "name": "stdout",
+     "text": "Time: 0.27s\tKernel: linear\tAccuracy_train: 0.9683908045977011\tAccuracy_test: 0.953125\nTime: 0.09s\tKernel: rbf\tAccuracy_train: 0.9875478927203065\tAccuracy_test: 0.9598214285714286\nTime: 0.06s\tKernel: poly\tAccuracy_train: 0.9885057471264368\tAccuracy_test: 0.9464285714285714\n"
+    }
+   ],
+   "source": [
+    "random_state=1\n",
+    "for kernel in ['linear', 'rbf', 'poly']:\n",
+    "    now = time.time()\n",
+    "    clf = Stree(C=7, kernel=kernel, random_state=random_state).fit(Xtrain, ytrain)\n",
+    "    accuracy_train = clf.score(Xtrain, ytrain)\n",
+    "    accuracy_test = clf.score(Xtest, ytest)\n",
+    "    time_spent = time.time() - now\n",
+    "    print(f\"Time: {time_spent:.2f}s\\tKernel: {kernel}\\tAccuracy_train: {accuracy_train}\\tAccuracy_test: {accuracy_test}\")\n"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Test different values of C"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 7,
+   "metadata": {
+    "tags": [
+     "outputPrepend"
+    ]
+   },
+   "outputs": [
+    {
+     "output_type": "stream",
+     "name": "stdout",
+     "text": "************** C=0.001 ****************************\nClassifier's accuracy (train): 0.9531\nClassifier's accuracy (test) : 0.9621\nroot\nroot - Down, <cgaf> - Leaf class=1 belief= 0.983713 counts=(array([0, 1]), array([  5, 302]))\nroot - Up, <cgaf> - Leaf class=0 belief= 0.940299 counts=(array([0, 1]), array([693,  44]))\n\n**************************************************\n************** C=0.01 ****************************\nClassifier's accuracy (train): 0.9569\nClassifier's accuracy (test) : 0.9621\nroot\nroot - Down, <cgaf> - Leaf class=1 belief= 0.990228 counts=(array([0, 1]), array([  3, 304]))\nroot - Up, <cgaf> - Leaf class=0 belief= 0.943012 counts=(array([0, 1]), array([695,  42]))\n\n**************************************************\n************** C=1 ****************************\nClassifier's accuracy (train): 0.9655\nClassifier's accuracy (test) : 0.9643\nroot\nroot - Down\nroot - Down - Down, <pure> - Leaf class=1 belief= 1.000000 counts=(array([1]), array([310]))\nroot - Down - Up, <pure> - Leaf class=0 belief= 1.000000 counts=(array([0]), array([5]))\nroot - Up, <cgaf> - Leaf class=0 belief= 0.950617 counts=(array([0, 1]), array([693,  36]))\n\n**************************************************\n************** C=5 ****************************\nClassifier's accuracy (train): 0.9684\nClassifier's accuracy (test) : 0.9598\nroot\nroot - Down\nroot - Down - Down, <pure> - Leaf class=1 belief= 1.000000 counts=(array([1]), array([311]))\nroot - Down - Up, <pure> - Leaf class=0 belief= 1.000000 counts=(array([0]), array([8]))\nroot - Up\nroot - Up - Down\nroot - Up - Down - Down, <pure> - Leaf class=1 belief= 1.000000 counts=(array([1]), array([1]))\nroot - Up - Down - Up, <pure> - Leaf class=0 belief= 1.000000 counts=(array([0]), array([2]))\nroot - Up - Up\nroot - Up - Up - Down, <pure> - Leaf class=0 belief= 1.000000 counts=(array([0]), array([2]))\nroot - Up - Up - Up\nroot - Up - Up - Up - Down\nroot - Up - Up - Up - Down - Down, 
<pure> - Leaf class=1 belief= 1.000000 counts=(array([1]), array([1]))\nroot - Up - Up - Up - Down - Up, <pure> - Leaf class=0 belief= 1.000000 counts=(array([0]), array([1]))\nroot - Up - Up - Up - Up, <cgaf> - Leaf class=0 belief= 0.954039 counts=(array([0, 1]), array([685,  33]))\n\n**************************************************\n************** C=17 ****************************\nClassifier's accuracy (train): 0.9751\nClassifier's accuracy (test) : 0.9464\nroot\nroot - Down\nroot - Down - Down, <pure> - Leaf class=1 belief= 1.000000 counts=(array([1]), array([304]))\nroot - Down - Up, <pure> - Leaf class=0 belief= 1.000000 counts=(array([0]), array([8]))\nroot - Up\nroot - Up - Down\nroot - Up - Down - Down, <pure> - Leaf class=1 belief= 1.000000 counts=(array([1]), array([4]))\nroot - Up - Down - Up, <pure> - Leaf class=0 belief= 1.000000 counts=(array([0]), array([3]))\nroot - Up - Up\nroot - Up - Up - Down\nroot - Up - Up - Down - Down, <pure> - Leaf class=1 belief= 1.000000 counts=(array([1]), array([4]))\nroot - Up - Up - Down - Up, <pure> - Leaf class=0 belief= 1.000000 counts=(array([0]), array([2]))\nroot - Up - Up - Up\nroot - Up - Up - Up - Down\nroot - Up - Up - Up - Down - Down, <pure> - Leaf class=1 belief= 1.000000 counts=(array([1]), array([3]))\nroot - Up - Up - Up - Down - Up, <pure> - Leaf class=0 belief= 1.000000 counts=(array([0]), array([1]))\nroot - Up - Up - Up - Up\nroot - Up - Up - Up - Up - Down\nroot - Up - Up - Up - Up - Down - Down, <pure> - Leaf class=1 belief= 1.000000 counts=(array([1]), array([3]))\nroot - Up - Up - Up - Up - Down - Up, <pure> - Leaf class=0 belief= 1.000000 counts=(array([0]), array([3]))\nroot - Up - Up - Up - Up - Up\nroot - Up - Up - Up - Up - Up - Down, <pure> - Leaf class=1 belief= 1.000000 counts=(array([1]), array([2]))\nroot - Up - Up - Up - Up - Up - Up, <cgaf> - Leaf class=0 belief= 0.963225 counts=(array([0, 1]), array([681,  26]))\n\n**************************************************\n0.6869 
secs\n"
+    }
+   ],
+   "source": [
+    "t = time.time()\n",
+    "for C in (.001, .01, 1, 5, 17):\n",
+    "    clf = Stree(C=C, kernel=\"linear\", random_state=random_state)\n",
+    "    clf.fit(Xtrain, ytrain)\n",
+    "    print(f\"************** C={C} ****************************\")\n",
+    "    print(f\"Classifier's accuracy (train): {clf.score(Xtrain, ytrain):.4f}\")\n",
+    "    print(f\"Classifier's accuracy (test) : {clf.score(Xtest, ytest):.4f}\")\n",
+    "    print(clf)\n",
+    "    print(f\"**************************************************\")\n",
+    "print(f\"{time.time() - t:.4f} secs\")"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Test iterator\n",
+    "Check different ways of using the iterator"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 8,
+   "metadata": {},
+   "outputs": [
+    {
+     "output_type": "stream",
+     "name": "stdout",
+     "text": "root\nroot - Down\nroot - Down - Down, <pure> - Leaf class=1 belief= 1.000000 counts=(array([1]), array([304]))\nroot - Down - Up, <pure> - Leaf class=0 belief= 1.000000 counts=(array([0]), array([8]))\nroot - Up\nroot - Up - Down\nroot - Up - Down - Down, <pure> - Leaf class=1 belief= 1.000000 counts=(array([1]), array([4]))\nroot - Up - Down - Up, <pure> - Leaf class=0 belief= 1.000000 counts=(array([0]), array([3]))\nroot - Up - Up\nroot - Up - Up - Down\nroot - Up - Up - Down - Down, <pure> - Leaf class=1 belief= 1.000000 counts=(array([1]), array([4]))\nroot - Up - Up - Down - Up, <pure> - Leaf class=0 belief= 1.000000 counts=(array([0]), array([2]))\nroot - Up - Up - Up\nroot - Up - Up - Up - Down\nroot - Up - Up - Up - Down - Down, <pure> - Leaf class=1 belief= 1.000000 counts=(array([1]), array([3]))\nroot - Up - Up - Up - Down - Up, <pure> - Leaf class=0 belief= 1.000000 counts=(array([0]), array([1]))\nroot - Up - Up - Up - Up\nroot - Up - Up - Up - Up - Down\nroot - Up - Up - Up - Up - Down - Down, <pure> - Leaf class=1 belief= 1.000000 counts=(array([1]), array([3]))\nroot - Up - Up - Up - Up - Down - Up, <pure> - Leaf class=0 belief= 1.000000 counts=(array([0]), array([3]))\nroot - Up - Up - Up - Up - Up\nroot - Up - Up - Up - Up - Up - Down, <pure> - Leaf class=1 belief= 1.000000 counts=(array([1]), array([2]))\nroot - Up - Up - Up - Up - Up - Up, <cgaf> - Leaf class=0 belief= 0.963225 counts=(array([0, 1]), array([681,  26]))\n"
+    }
+   ],
+   "source": [
+    "#check iterator\n",
+    "for i in list(clf):\n",
+    "    print(i)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 9,
+   "metadata": {},
+   "outputs": [
+    {
+     "output_type": "stream",
+     "name": "stdout",
+     "text": "root\nroot - Down\nroot - Down - Down, <pure> - Leaf class=1 belief= 1.000000 counts=(array([1]), array([304]))\nroot - Down - Up, <pure> - Leaf class=0 belief= 1.000000 counts=(array([0]), array([8]))\nroot - Up\nroot - Up - Down\nroot - Up - Down - Down, <pure> - Leaf class=1 belief= 1.000000 counts=(array([1]), array([4]))\nroot - Up - Down - Up, <pure> - Leaf class=0 belief= 1.000000 counts=(array([0]), array([3]))\nroot - Up - Up\nroot - Up - Up - Down\nroot - Up - Up - Down - Down, <pure> - Leaf class=1 belief= 1.000000 counts=(array([1]), array([4]))\nroot - Up - Up - Down - Up, <pure> - Leaf class=0 belief= 1.000000 counts=(array([0]), array([2]))\nroot - Up - Up - Up\nroot - Up - Up - Up - Down\nroot - Up - Up - Up - Down - Down, <pure> - Leaf class=1 belief= 1.000000 counts=(array([1]), array([3]))\nroot - Up - Up - Up - Down - Up, <pure> - Leaf class=0 belief= 1.000000 counts=(array([0]), array([1]))\nroot - Up - Up - Up - Up\nroot - Up - Up - Up - Up - Down\nroot - Up - Up - Up - Up - Down - Down, <pure> - Leaf class=1 belief= 1.000000 counts=(array([1]), array([3]))\nroot - Up - Up - Up - Up - Down - Up, <pure> - Leaf class=0 belief= 1.000000 counts=(array([0]), array([3]))\nroot - Up - Up - Up - Up - Up\nroot - Up - Up - Up - Up - Up - Down, <pure> - Leaf class=1 belief= 1.000000 counts=(array([1]), array([2]))\nroot - Up - Up - Up - Up - Up - Up, <cgaf> - Leaf class=0 belief= 0.963225 counts=(array([0, 1]), array([681,  26]))\n"
+    }
+   ],
+   "source": [
+    "#check iterator again\n",
+    "for i in clf:\n",
+    "    print(i)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Test STree is a sklearn estimator"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 10,
+   "metadata": {},
+   "outputs": [
+    {
+     "output_type": "stream",
+     "name": "stdout",
+     "text": "1 functools.partial(<function check_no_attributes_set_in_init at 0x1254f13b0>, 'Stree')\n2 functools.partial(<function check_estimators_dtypes at 0x1254e84d0>, 'Stree')\n3 functools.partial(<function check_fit_score_takes_y at 0x1254e83b0>, 'Stree')\n4 functools.partial(<function check_sample_weights_pandas_series at 0x1254e0cb0>, 'Stree')\n5 functools.partial(<function check_sample_weights_not_an_array at 0x1254e0dd0>, 'Stree')\n6 functools.partial(<function check_sample_weights_list at 0x1254e0ef0>, 'Stree')\n7 functools.partial(<function check_sample_weights_shape at 0x1254e2050>, 'Stree')\n8 functools.partial(<function check_sample_weights_invariance at 0x1254e2170>, 'Stree')\n9 functools.partial(<function check_estimators_fit_returns_self at 0x1254eb4d0>, 'Stree')\n10 functools.partial(<function check_estimators_fit_returns_self at 0x1254eb4d0>, 'Stree', readonly_memmap=True)\n11 functools.partial(<function check_complex_data at 0x1254e2320>, 'Stree')\n12 functools.partial(<function check_dtype_object at 0x1254e2290>, 'Stree')\n13 functools.partial(<function check_estimators_empty_data_messages at 0x1254e85f0>, 'Stree')\n14 functools.partial(<function check_pipeline_consistency at 0x1254e8290>, 'Stree')\n15 functools.partial(<function check_estimators_nan_inf at 0x1254e8710>, 'Stree')\n16 functools.partial(<function check_estimators_overwrite_params at 0x1254f1290>, 'Stree')\n17 functools.partial(<function check_estimator_sparse_data at 0x1254e0b90>, 'Stree')\n18 functools.partial(<function check_estimators_pickle at 0x1254e8950>, 'Stree')\n19 functools.partial(<function check_classifier_data_not_an_array at 0x1254f15f0>, 'Stree')\n20 functools.partial(<function check_classifiers_one_label at 0x1254eb050>, 'Stree')\n21 functools.partial(<function check_classifiers_classes at 0x1254eba70>, 'Stree')\n22 functools.partial(<function check_estimators_partial_fit_n_features at 0x1254e8a70>, 'Stree')\n23 functools.partial(<function 
check_classifiers_train at 0x1254eb170>, 'Stree')\n24 functools.partial(<function check_classifiers_train at 0x1254eb170>, 'Stree', readonly_memmap=True)\n25 functools.partial(<function check_classifiers_train at 0x1254eb170>, 'Stree', readonly_memmap=True, X_dtype='float32')\n26 functools.partial(<function check_classifiers_regression_target at 0x1254f40e0>, 'Stree')\n27 functools.partial(<function check_supervised_y_no_nan at 0x1254da9e0>, 'Stree')\n28 functools.partial(<function check_supervised_y_2d at 0x1254eb710>, 'Stree')\n29 functools.partial(<function check_estimators_unfitted at 0x1254eb5f0>, 'Stree')\n30 functools.partial(<function check_non_transformer_estimators_n_iter at 0x1254f1c20>, 'Stree')\n31 functools.partial(<function check_decision_proba_consistency at 0x1254f4200>, 'Stree')\n32 functools.partial(<function check_fit2d_predict1d at 0x1254e2830>, 'Stree')\n33 functools.partial(<function check_methods_subset_invariance at 0x1254e29e0>, 'Stree')\n34 functools.partial(<function check_fit2d_1sample at 0x1254e2b00>, 'Stree')\n35 functools.partial(<function check_fit2d_1feature at 0x1254e2c20>, 'Stree')\n36 functools.partial(<function check_fit1d at 0x1254e2d40>, 'Stree')\n37 functools.partial(<function check_get_params_invariance at 0x1254f1e60>, 'Stree')\n38 functools.partial(<function check_set_params at 0x1254f1f80>, 'Stree')\n39 functools.partial(<function check_dict_unchanged at 0x1254e2440>, 'Stree')\n40 functools.partial(<function check_dont_overwrite_parameters at 0x1254e2710>, 'Stree')\n41 functools.partial(<function check_fit_idempotent at 0x1254f43b0>, 'Stree')\n42 functools.partial(<function check_n_features_in at 0x1254f4440>, 'Stree')\n43 functools.partial(<function check_requires_y_none at 0x1254f44d0>, 'Stree')\n"
+    }
+   ],
+   "source": [
+    "# Make checks one by one\n",
+    "c = 0\n",
+    "checks = check_estimator(Stree(), generate_only=True)\n",
+    "for check in checks:\n",
+    "    c += 1\n",
+    "    print(c, check[1])\n",
+    "    check[1](check[0])"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 11,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# Check if the classifier is a sklearn estimator\n",
+    "check_estimator(Stree())"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Compare to SVM"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 12,
+   "metadata": {},
+   "outputs": [
+    {
+     "output_type": "stream",
+     "name": "stdout",
+     "text": "== Not Weighted ===\nSVC train score ..: 0.9521072796934866\nSTree train score : 0.9578544061302682\nSVC test score ...: 0.9553571428571429\nSTree test score .: 0.9575892857142857\n==== Weighted =====\nSVC train score ..: 0.9616858237547893\nSTree train score : 0.9616858237547893\nSVC test score ...: 0.9642857142857143\nSTree test score .: 0.9598214285714286\n*SVC test score ..: 0.951413553411694\n*STree test score : 0.9480517444389333\n"
+    }
+   ],
+   "source": [
+    "svc = SVC(C=7, kernel='rbf', gamma=.001, random_state=random_state)\n",
+    "clf = Stree(C=17, kernel='rbf', gamma=.001, random_state=random_state)\n",
+    "svc.fit(Xtrain, ytrain)\n",
+    "clf.fit(Xtrain, ytrain)\n",
+    "print(\"== Not Weighted ===\")\n",
+    "print(\"SVC train score ..:\", svc.score(Xtrain, ytrain))\n",
+    "print(\"STree train score :\", clf.score(Xtrain, ytrain))\n",
+    "print(\"SVC test score ...:\", svc.score(Xtest, ytest))\n",
+    "print(\"STree test score .:\", clf.score(Xtest, ytest))\n",
+    "svc.fit(Xtrain, ytrain, weights)\n",
+    "clf.fit(Xtrain, ytrain, weights)\n",
+    "print(\"==== Weighted =====\")\n",
+    "print(\"SVC train score ..:\", svc.score(Xtrain, ytrain))\n",
+    "print(\"STree train score :\", clf.score(Xtrain, ytrain))\n",
+    "print(\"SVC test score ...:\", svc.score(Xtest, ytest))\n",
+    "print(\"STree test score .:\", clf.score(Xtest, ytest))\n",
+    "print(\"*SVC test score ..:\", svc.score(Xtest, ytest, weights_test))\n",
+    "print(\"*STree test score :\", clf.score(Xtest, ytest, weights_test))"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 13,
+   "metadata": {},
+   "outputs": [
+    {
+     "output_type": "stream",
+     "name": "stdout",
+     "text": "root\nroot - Down\nroot - Down - Down, <cgaf> - Leaf class=1 belief= 0.969325 counts=(array([0, 1]), array([ 10, 316]))\nroot - Down - Up, <pure> - Leaf class=0 belief= 1.000000 counts=(array([0]), array([1]))\nroot - Up, <cgaf> - Leaf class=0 belief= 0.958159 counts=(array([0, 1]), array([687,  30]))\n\n"
+    }
+   ],
+   "source": [
+    "print(clf)"
+   ]
+  }
+ ],
+ "metadata": {
+  "kernelspec": {
+   "display_name": "Python 3.7.6 64-bit ('general': venv)",
+   "language": "python",
+   "name": "python37664bitgeneralvenvfbd0a23e74cf4e778460f5ffc6761f39"
+  },
+  "language_info": {
+   "codemirror_mode": {
+    "name": "ipython",
+    "version": 3
+   },
+   "file_extension": ".py",
+   "mimetype": "text/x-python",
+   "name": "python",
+   "nbconvert_exporter": "python",
+   "pygments_lexer": "ipython3",
+   "version": "3.7.6-final"
+  }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 2
+}
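Note on the leaf lines printed by `print(clf)` in the notebook above: the `belief` value can be reproduced from the `counts` shown on the same line. The following is a minimal pure-Python sketch of the rule used by `Snode.make_predictor` in `stree/Strees.py`; the function name `leaf_belief` is hypothetical and is not part of the package.

```python
from collections import Counter


def leaf_belief(labels):
    """Return (majority_class, belief) for the labels that reach a leaf.

    Hypothetical helper mirroring Snode.make_predictor: belief is
    max_count / (max_count + min_count), and 1.0 for a pure or empty leaf.
    """
    counts = Counter(labels)
    if not counts:
        return None, 1.0
    if len(counts) == 1:
        return next(iter(counts)), 1.0
    max_card = max(counts.values())
    min_card = min(counts.values())
    # On ties, take the smallest label, as indexing the sorted classes does
    majority = min(c for c, n in counts.items() if n == max_card)
    return majority, max_card / (max_card + min_card)


# counts=(array([0, 1]), array([ 10, 316])) from the notebook output
cls, belief = leaf_belief([0] * 10 + [1] * 316)
print(cls, f"{belief:.6f}")
```

With more than two classes in a leaf this rule only uses the largest and the smallest counts, so the belief is not the majority fraction of the leaf.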
--- a/notebooks/gridsearch.ipynb
+++ b/notebooks/gridsearch.ipynb
@@ -0,0 +1,244 @@
+{
+  "cells": [
+    {
+      "cell_type": "markdown",
+      "metadata": {},
+      "source": [
+        "# Test Gridsearch\n",
+        "with different kernels and different configurations"
+      ]
+    },
+    {
+      "cell_type": "markdown",
+      "metadata": {},
+      "source": [
+        "# Setup\n",
+        "Uncomment the next cell if STree is not already installed"
+      ]
+    },
+    {
+      "cell_type": "code",
+      "execution_count": 1,
+      "metadata": {},
+      "outputs": [],
+      "source": [
+        "#\n",
+        "# Google Colab setup\n",
+        "#\n",
+        "#!pip install git+https://github.com/doctorado-ml/stree"
+      ]
+    },
+    {
+      "cell_type": "code",
+      "metadata": {
+        "id": "zIHKVxthDZEa",
+        "colab_type": "code",
+        "colab": {}
+      },
+      "source": [
+        "from sklearn.ensemble import AdaBoostClassifier\n",
+        "from sklearn.svm import LinearSVC\n",
+        "from sklearn.model_selection import GridSearchCV, train_test_split\n",
+        "from stree import Stree"
+      ],
+      "execution_count": 2,
+      "outputs": []
+    },
+    {
+      "cell_type": "code",
+      "metadata": {
+        "id": "IEmq50QgDZEi",
+        "colab_type": "code",
+        "colab": {}
+      },
+      "source": [
+        "import os\n",
+        "if not os.path.isfile('data/creditcard.csv'):\n",
+        "    !wget --no-check-certificate --content-disposition http://nube.jccm.es/index.php/s/Zs7SYtZQJ3RQ2H2/download\n",
+        "    !tar xzf creditcard.tgz"
+      ],
+      "execution_count": 3,
+      "outputs": []
+    },
+    {
+      "cell_type": "code",
+      "metadata": {
+        "id": "z9Q-YUfBDZEq",
+        "colab_type": "code",
+        "colab": {},
+        "outputId": "afc822fb-f16a-4302-8a67-2b9e2880159b"
+      },
+      "source": [
+        "random_state=1\n",
+        "\n",
+        "def load_creditcard(n_examples=0):\n",
+        "    import pandas as pd\n",
+        "    import numpy as np\n",
+        "    import random\n",
+        "    df = pd.read_csv('data/creditcard.csv')\n",
+        "    print(\"Fraud: {0:.3f}% {1}\".format(df.Class[df.Class == 1].count()*100/df.shape[0], df.Class[df.Class == 1].count()))\n",
+        "    print(\"Valid: {0:.3f}% {1}\".format(df.Class[df.Class == 0].count()*100/df.shape[0], df.Class[df.Class == 0].count()))\n",
+        "    y = df.Class\n",
+        "    X = df.drop(['Class', 'Time', 'Amount'], axis=1).values\n",
+        "    if n_examples > 0:\n",
+        "        # Take first n_examples samples\n",
+        "        X = X[:n_examples, :]\n",
+        "        y = y[:n_examples]\n",
+        "    else:\n",
+        "        # Take all the positive samples plus -n_examples randomly chosen samples\n",
+        "        if n_examples < 0:\n",
+        "            Xt = X[(y == 1).ravel()]\n",
+        "            yt = y[(y == 1).ravel()]\n",
+        "            indices = random.sample(range(X.shape[0]), -1 * n_examples)\n",
+        "            X = np.append(Xt, X[indices], axis=0)\n",
+        "            y = np.append(yt, y[indices], axis=0)\n",
+        "    print(\"X.shape\", X.shape, \" y.shape\", y.shape)\n",
+        "    print(\"Fraud: {0:.3f}% {1}\".format(len(y[y == 1])*100/X.shape[0], len(y[y == 1])))\n",
+        "    print(\"Valid: {0:.3f}% {1}\".format(len(y[y == 0]) * 100 / X.shape[0], len(y[y == 0])))\n",
+        "    Xtrain, Xtest, ytrain, ytest = train_test_split(X, y, train_size=0.7, shuffle=True, random_state=random_state, stratify=y)\n",
+        "    return Xtrain, Xtest, ytrain, ytest\n",
+        "\n",
+        "data = load_creditcard(-1000) # Take all true samples + 1000 of the others\n",
+        "# data = load_creditcard(5000)  # Take the first 5000 samples\n",
+        "# data = load_creditcard(0) # Take all the samples\n",
+        "\n",
+        "Xtrain = data[0]\n",
+        "Xtest = data[1]\n",
+        "ytrain = data[2]\n",
+        "ytest = data[3]"
+      ],
+      "execution_count": 4,
+      "outputs": [
+        {
+          "output_type": "stream",
+          "name": "stdout",
+          "text": "Fraud: 0.173% 492\nValid: 99.827% 284315\nX.shape (1492, 28)  y.shape (1492,)\nFraud: 33.244% 496\nValid: 66.756% 996\n"
+        }
+      ]
+    },
+    {
+      "cell_type": "markdown",
+      "metadata": {},
+      "source": [
+        "# Tests"
+      ]
+    },
+    {
+      "cell_type": "code",
+      "metadata": {
+        "id": "HmX3kR4PDZEw",
+        "colab_type": "code",
+        "colab": {}
+      },
+      "source": [
+        "parameters = {\n",
+        "    'base_estimator': [Stree()],\n",
+        "    'n_estimators': [10, 25],\n",
+        "    'learning_rate': [.5, 1],\n",
+        "    'base_estimator__tol': [.1,  1e-02],\n",
+        "    'base_estimator__max_depth': [3, 5],\n",
+        "    'base_estimator__C': [1, 3],\n",
+        "    'base_estimator__kernel': ['linear', 'poly', 'rbf']\n",
+        "}"
+      ],
+      "execution_count": 9,
+      "outputs": []
+    },
+    {
+      "cell_type": "code",
+      "execution_count": 14,
+      "metadata": {},
+      "outputs": [
+        {
+          "output_type": "execute_result",
+          "data": {
+            "text/plain": "{'C': 1.0,\n 'degree': 3,\n 'gamma': 'scale',\n 'kernel': 'linear',\n 'max_depth': None,\n 'max_iter': 1000,\n 'min_samples_split': 0,\n 'random_state': None,\n 'tol': 0.0001}"
+          },
+          "metadata": {},
+          "execution_count": 14
+        }
+      ],
+      "source": [
+        "Stree().get_params()"
+      ]
+    },
+    {
+      "cell_type": "code",
+      "metadata": {
+        "id": "CrcB8o6EDZE5",
+        "colab_type": "code",
+        "colab": {},
+        "outputId": "7703413a-d563-4289-a13b-532f38f82762"
+      },
+      "source": [
+        "random_state=2020\n",
+        "clf = AdaBoostClassifier(random_state=random_state)\n",
+        "grid = GridSearchCV(clf, parameters, verbose=10, n_jobs=-1, return_train_score=True)\n",
+        "grid.fit(Xtrain, ytrain)"
+      ],
+      "execution_count": 11,
+      "outputs": [
+        {
+          "output_type": "stream",
+          "name": "stdout",
+          "text": "Fitting 5 folds for each of 96 candidates, totalling 480 fits\n[Parallel(n_jobs=-1)]: Using backend LokyBackend with 8 concurrent workers.\n[Parallel(n_jobs=-1)]: Done   2 tasks      | elapsed:    3.6s\n[Parallel(n_jobs=-1)]: Done   9 tasks      | elapsed:    4.2s\n[Parallel(n_jobs=-1)]: Done  16 tasks      | elapsed:    4.8s\n[Parallel(n_jobs=-1)]: Done  25 tasks      | elapsed:    5.3s\n[Parallel(n_jobs=-1)]: Done  34 tasks      | elapsed:    6.2s\n[Parallel(n_jobs=-1)]: Done  45 tasks      | elapsed:    7.2s\n[Parallel(n_jobs=-1)]: Done  56 tasks      | elapsed:    8.9s\n[Parallel(n_jobs=-1)]: Done  69 tasks      | elapsed:   10.7s\n[Parallel(n_jobs=-1)]: Done  82 tasks      | elapsed:   12.7s\n[Parallel(n_jobs=-1)]: Done  97 tasks      | elapsed:   16.7s\n[Parallel(n_jobs=-1)]: Done 112 tasks      | elapsed:   19.4s\n[Parallel(n_jobs=-1)]: Done 129 tasks      | elapsed:   24.4s\n[Parallel(n_jobs=-1)]: Done 146 tasks      | elapsed:   29.3s\n[Parallel(n_jobs=-1)]: Done 165 tasks      | elapsed:   32.7s\n[Parallel(n_jobs=-1)]: Done 184 tasks      | elapsed:   36.4s\n[Parallel(n_jobs=-1)]: Done 205 tasks      | elapsed:   39.7s\n[Parallel(n_jobs=-1)]: Done 226 tasks      | elapsed:   43.7s\n[Parallel(n_jobs=-1)]: Done 249 tasks      | elapsed:   46.6s\n[Parallel(n_jobs=-1)]: Done 272 tasks      | elapsed:   48.8s\n[Parallel(n_jobs=-1)]: Done 297 tasks      | elapsed:   52.0s\n[Parallel(n_jobs=-1)]: Done 322 tasks      | elapsed:   55.9s\n[Parallel(n_jobs=-1)]: Done 349 tasks      | elapsed:  1.0min\n[Parallel(n_jobs=-1)]: Done 376 tasks      | elapsed:  1.2min\n[Parallel(n_jobs=-1)]: Done 405 tasks      | elapsed:  1.3min\n[Parallel(n_jobs=-1)]: Done 434 tasks      | elapsed:  1.3min\n[Parallel(n_jobs=-1)]: Done 465 tasks      | elapsed:  1.4min\n[Parallel(n_jobs=-1)]: Done 480 out of 480 | elapsed:  1.5min finished\n"
+        },
+        {
+          "output_type": "execute_result",
+          "data": {
+            "text/plain": "GridSearchCV(estimator=AdaBoostClassifier(random_state=2020), n_jobs=-1,\n             param_grid={'base_estimator': [Stree(C=1, max_depth=3, tol=0.1)],\n                         'base_estimator__C': [1, 3],\n                         'base_estimator__kernel': ['linear', 'poly', 'rbf'],\n                         'base_estimator__max_depth': [3, 5],\n                         'base_estimator__tol': [0.1, 0.01],\n                         'learning_rate': [0.5, 1], 'n_estimators': [10, 25]},\n             return_train_score=True, verbose=10)"
+          },
+          "metadata": {},
+          "execution_count": 11
+        }
+      ]
+    },
+    {
+      "cell_type": "code",
+      "metadata": {
+        "id": "ZjX88NoYDZE8",
+        "colab_type": "code",
+        "colab": {},
+        "outputId": "285163c8-fa33-4915-8ae7-61c4f7844344"
+      },
+      "source": [
+        "print(\"Best estimator: \", grid.best_estimator_)\n",
+        "print(\"Best hyperparameters: \", grid.best_params_)\n",
+        "print(\"Best accuracy: \", grid.best_score_)"
+      ],
+      "execution_count": 16,
+      "outputs": [
+        {
+          "output_type": "stream",
+          "name": "stdout",
+          "text": "Best estimator:  AdaBoostClassifier(base_estimator=Stree(C=1, max_depth=3, tol=0.1),\n                   learning_rate=0.5, n_estimators=10, random_state=2020)\nBest hyperparameters:  {'base_estimator': Stree(C=1, max_depth=3, tol=0.1), 'base_estimator__C': 1, 'base_estimator__kernel': 'linear', 'base_estimator__max_depth': 3, 'base_estimator__tol': 0.1, 'learning_rate': 0.5, 'n_estimators': 10}\nBest accuracy:  0.9492316893632683\n"
+        }
+      ]
+    }
+  ],
+  "metadata": {
+    "language_info": {
+      "codemirror_mode": {
+        "name": "ipython",
+        "version": 3
+      },
+      "file_extension": ".py",
+      "mimetype": "text/x-python",
+      "name": "python",
+      "nbconvert_exporter": "python",
+      "pygments_lexer": "ipython3",
+      "version": "3.7.6-final"
+    },
+    "orig_nbformat": 2,
+    "kernelspec": {
+      "name": "python37664bitgeneralvenvfbd0a23e74cf4e778460f5ffc6761f39",
+      "display_name": "Python 3.7.6 64-bit ('general': venv)"
+    },
+    "colab": {
+      "name": "gridsearch.ipynb",
+      "provenance": []
+    }
+  },
+  "nbformat": 4,
+  "nbformat_minor": 0
+}
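Note on the log line "Fitting 5 folds for each of 96 candidates, totalling 480 fits" in the notebook above: the count follows from the Cartesian product of the value lists in `parameters`. A minimal sketch with the standard library only; the string `"Stree()"` is a placeholder for the estimator instance, and the 5 folds are taken from the log.

```python
from itertools import product

# Same grid as in the notebook; the estimator is replaced by a placeholder
param_grid = {
    "base_estimator": ["Stree()"],
    "n_estimators": [10, 25],
    "learning_rate": [0.5, 1],
    "base_estimator__tol": [0.1, 1e-02],
    "base_estimator__max_depth": [3, 5],
    "base_estimator__C": [1, 3],
    "base_estimator__kernel": ["linear", "poly", "rbf"],
}

# Every combination of one value per parameter is one candidate
candidates = [
    dict(zip(param_grid, values)) for values in product(*param_grid.values())
]
n_folds = 5  # number of folds reported in the log
print(len(candidates), len(candidates) * n_folds)
```

Adding one more value to any list multiplies the number of fits, which is worth checking before launching a search with `n_jobs=-1`.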
--- a/notebooks/test_graphs.ipynb
+++ b/notebooks/test_graphs.ipynb
--- a/pyproject.toml
+++ b/pyproject.toml
@@ -0,0 +1,16 @@
+[tool.black]
+line-length = 79
+include = '\.pyi?$'
+exclude = '''
+/(
+    \.git
+  | \.hg
+  | \.mypy_cache
+  | \.tox
+  | \.venv
+  | _build
+  | buck-out
+  | build
+  | dist
+)/
+'''
--- a/requirements.txt
+++ b/requirements.txt
@@ -1,3 +1,5 @@
-numpy==1.18.2
-scikit-learn==0.22.2
-pandas==1.0.3
+numpy
+scikit-learn
+pandas
+matplotlib
+ipympl
--- a/setup.py
+++ b/setup.py
@@ -0,0 +1,36 @@
+import setuptools
+
+__version__ = "0.9rc4"
+__author__ = "Ricardo Montañana Gómez"
+
+
+def readme():
+    with open("README.md") as f:
+        return f.read()
+
+
+setuptools.setup(
+    name="STree",
+    version=__version__,
+    license="MIT License",
+    description="Oblique decision tree with svm nodes",
+    long_description=readme(),
+    long_description_content_type="text/markdown",
+    packages=setuptools.find_packages(),
+    url="https://github.com/doctorado-ml/stree",
+    author=__author__,
+    author_email="ricardo.montanana@alu.uclm.es",
+    keywords="scikit-learn oblique-classifier oblique-decision-tree "
+    "decision-tree svm svc",
+    classifiers=[
+        "Development Status :: 4 - Beta",
+        "License :: OSI Approved :: MIT License",
+        "Programming Language :: Python :: 3.7",
+        "Natural Language :: English",
+        "Topic :: Scientific/Engineering :: Artificial Intelligence",
+        "Intended Audience :: Science/Research",
+    ],
+    install_requires=["scikit-learn>=0.23.0", "numpy", "matplotlib", "ipympl"],
+    test_suite="stree.tests",
+    zip_safe=False,
+)
--- a/stree/Strees.py
+++ b/stree/Strees.py
@@ -0,0 +1,449 @@
+"""
+__author__ = "Ricardo Montañana Gómez"
+__copyright__ = "Copyright 2020, Ricardo Montañana Gómez"
+__license__ = "MIT"
+__version__ = "0.9"
+Build an oblique tree classifier based on SVM Trees
+"""
+
+import os
+
+import numpy as np
+from sklearn.base import BaseEstimator, ClassifierMixin
+from sklearn.svm import SVC, LinearSVC
+from sklearn.utils import check_consistent_length
+from sklearn.utils.multiclass import check_classification_targets
+from sklearn.utils.validation import (
+    check_X_y,
+    check_array,
+    check_is_fitted,
+    _check_sample_weight,
+)
+from sklearn.metrics._classification import _weighted_sum, _check_targets
+
+
+class Snode:
+    """Node of the tree that keeps the svm classifier, the labels and, only
+    when testing, the samples assigned to it
+    """
+
+    def __init__(self, clf: SVC, X: np.ndarray, y: np.ndarray, title: str):
+        self._clf = clf
+        self._title = title
+        self._belief = 0.0
+        # Only store dataset in Testing
+        self._X = X if os.environ.get("TESTING", "NS") != "NS" else None
+        self._y = y
+        self._down = None
+        self._up = None
+        self._class = None
+
+    @classmethod
+    def copy(cls, node: "Snode") -> "Snode":
+        return cls(node._clf, node._X, node._y, node._title)
+
+    def set_down(self, son):
+        self._down = son
+
+    def set_up(self, son):
+        self._up = son
+
+    def is_leaf(self) -> bool:
+        return self._up is None and self._down is None
+
+    def get_down(self) -> "Snode":
+        return self._down
+
+    def get_up(self) -> "Snode":
+        return self._up
+
+    def make_predictor(self):
+        """Compute the class of the predictor and its belief based on the
+        subdataset of the node only if it is a leaf
+        """
+        if not self.is_leaf():
+            return
+        classes, card = np.unique(self._y, return_counts=True)
+        if len(classes) > 1:
+            max_card = max(card)
+            min_card = min(card)
+            self._class = classes[card == max_card][0]
+            self._belief = max_card / (max_card + min_card)
+        else:
+            self._belief = 1
+            try:
+                self._class = classes[0]
+            except IndexError:
+                self._class = None
+
+    def __str__(self) -> str:
+        if self.is_leaf():
+            count_values = np.unique(self._y, return_counts=True)
+            result = (
+                f"{self._title} - Leaf class={self._class} belief="
+                f"{self._belief: .6f} counts={count_values}"
+            )
+            return result
+        else:
+            return f"{self._title}"
+
+
+class Siterator:
+    """Stree preorder iterator
+    """
+
+    def __init__(self, tree: Snode):
+        self._stack = []
+        self._push(tree)
+
+    def __iter__(self):
+        return self
+
+    def _push(self, node: Snode):
+        if node is not None:
+            self._stack.append(node)
+
+    def __next__(self) -> Snode:
+        if len(self._stack) == 0:
+            raise StopIteration()
+        node = self._stack.pop()
+        self._push(node.get_up())
+        self._push(node.get_down())
+        return node
+
+
+class Stree(BaseEstimator, ClassifierMixin):
+    """Estimator based on binary trees of svm nodes.
+    It accepts sample_weight in fit and score, so it can be used in sklearn
+    boosting methods. Inheriting from BaseEstimator provides the get_params
+    and set_params methods; inheriting from ClassifierMixin provides the
+    attribute _estimator_type with "classifier" as value
+    """
+
+    def __init__(
+        self,
+        C: float = 1.0,
+        kernel: str = "linear",
+        max_iter: int = 1000,
+        random_state: int = None,
+        max_depth: int = None,
+        tol: float = 1e-4,
+        degree: int = 3,
+        gamma="scale",
+        split_criteria="max_samples",
+        min_samples_split: int = 0,
+    ):
+        self.max_iter = max_iter
+        self.C = C
+        self.kernel = kernel
+        self.random_state = random_state
+        self.max_depth = max_depth
+        self.tol = tol
+        self.gamma = gamma
+        self.degree = degree
+        self.min_samples_split = min_samples_split
+        self.split_criteria = split_criteria
+
+    def _more_tags(self) -> dict:
+        """Required by sklearn to supply features of the classifier
+
+        :return: the tag required
+        :rtype: dict
+        """
+        return {"requires_y": True}
+
+    def _split_array(self, origin: np.array, down: np.array) -> tuple:
+        """Split an array in two with a boolean mask (down) and its complement
+
+        :param origin: dataset to split
+        :type origin: np.array
+        :param down: boolean mask of the samples that go down
+        :type down: np.array
+        :return: tuple (up, down) with the two splits; None for an empty split
+        :rtype: tuple
+        """
+        up = ~down
+        return (
+            origin[up] if any(up) else None,
+            origin[down] if any(down) else None,
+        )
+
+    def _distances(self, node: Snode, data: np.ndarray) -> np.array:
+        """Compute distances of the samples to the hyperplane of the node
+
+        :param node: node containing the svm classifier
+        :type node: Snode
+        :param data: samples to find out distance to hyperplane
+        :type data: np.ndarray
+        :return: array with the distances of every sample to the hyperplane(s)
+        of the node, shape (m,) with two classes, (m, nclasses) otherwise
+        :rtype: np.array
+        """
+        return node._clf.decision_function(data)
+
+    def _min_distance(self, data: np.array, _) -> np.array:
+        # chooses the lowest distance (in absolute value) of every sample
+        indices = np.argmin(np.abs(data), axis=1)
+        return data[np.arange(data.shape[0]), indices]
+
+    def _max_samples(self, data: np.array, y: np.array) -> np.array:
+        # select the class with max number of samples
+        _, samples = np.unique(y, return_counts=True)
+        selected = np.argmax(samples)
+        return data[:, selected]
+
+    def _split_criteria(self, data: np.array, node: Snode) -> np.array:
+        """Set the criteria to split arrays
+
+        :param data: distances of samples to hyperplanes shape (m, nclasses)
+        if nclasses > 2 else (m,)
+        :type data: np.array
+        :param node: node containing the svm classifier
+        :type node: Snode
+        :return: array of booleans of samples under or above zero
+        :rtype: np.array
+        """
+
+        if data.shape[0] < self.min_samples_split:
+            return np.ones((data.shape[0]), dtype=bool)
+        if data.ndim > 1:
+            # split criteria for multiclass
+            data = getattr(self, f"_{self.split_criteria}")(data, node._y)
+        res = data > 0
+        return res
+
+    def fit(
+        self, X: np.ndarray, y: np.ndarray, sample_weight: np.array = None
+    ) -> "Stree":
+        """Build the tree based on the dataset of samples and its labels
+
+        :param X: dataset of samples used to build the tree
+        :type X: np.array
+        :param y: samples labels
+        :type y: np.array
+        :param sample_weight: weights of the samples. Rescale C per sample.
+        High weights force the classifier to put more emphasis on these points
+        :type sample_weight: np.array optional
+        :raises ValueError: if C, max_depth or split_criteria are invalid
+        :return: itself to be able to chain actions: fit().predict() ...
+        :rtype: Stree
+        """
+        # Check parameters are Ok.
+        if self.C < 0:
+            raise ValueError(
+                f"Penalty term must be positive... got (C={self.C:f})"
+            )
+        self.__max_depth = (
+            np.iinfo(np.int32).max
+            if self.max_depth is None
+            else self.max_depth
+        )
+        if self.__max_depth < 1:
+            raise ValueError(
+                "Maximum depth has to be at least 1... got "
+                f"(max_depth={self.max_depth})"
+            )
+        if self.split_criteria not in ["min_distance", "max_samples"]:
+            raise ValueError(
+                "split_criteria has to be min_distance or "
+                f"max_samples got ({self.split_criteria})"
+            )
+
+        check_classification_targets(y)
+        X, y = check_X_y(X, y)
+        sample_weight = _check_sample_weight(sample_weight, X)
+        check_classification_targets(y)
+        # Initialize computed parameters
+        self.classes_, y = np.unique(y, return_inverse=True)
+        self.n_classes_ = self.classes_.shape[0]
+        self.n_iter_ = self.max_iter
+        self.depth_ = 0
+        self.n_features_in_ = X.shape[1]
+        self.tree_ = self.train(X, y, sample_weight, 1, "root")
+        self._build_predictor()
+        return self
+
+    def train(
+        self,
+        X: np.ndarray,
+        y: np.ndarray,
+        sample_weight: np.ndarray,
+        depth: int,
+        title: str,
+    ) -> Snode:
+        """Recursive function to split the original dataset into predictor
+        nodes (leaves)
+
+        :param X: samples dataset
+        :type X: np.ndarray
+        :param y: samples labels
+        :type y: np.ndarray
+        :param sample_weight: weight of samples. Rescale C per sample.
+        High weights force the classifier to put more emphasis on these points.
+        :type sample_weight: np.ndarray
+        :param depth: current depth in the tree
+        :type depth: int
+        :param title: description of the node
+        :type title: str
+        :return: binary tree
+        :rtype: Snode
+        """
+        if depth > self.__max_depth:
+            return None
+        if np.unique(y).shape[0] == 1:
+            # only 1 class => pure dataset
+            return Snode(None, X, y, title + ", <pure>")
+        # Train the model
+        clf = self._build_clf()
+        clf.fit(X, y, sample_weight=sample_weight)
+        node = Snode(clf, X, y, title)
+        self.depth_ = max(depth, self.depth_)
+        down = self._split_criteria(self._distances(node, X), node)
+        X_U, X_D = self._split_array(X, down)
+        y_u, y_d = self._split_array(y, down)
+        sw_u, sw_d = self._split_array(sample_weight, down)
+        if X_U is None or X_D is None:
+            # the split left one side empty, make this node a leaf
+            return Snode(clf, X, y, title + ", <cgaf>")
+        node.set_up(self.train(X_U, y_u, sw_u, depth + 1, title + " - Up"))
+        node.set_down(self.train(X_D, y_d, sw_d, depth + 1, title + " - Down"))
+        return node
+
+    def _build_predictor(self):
+        """Process the leaves to make them predictors
+        """
+
+        def run_tree(node: Snode):
+            if node.is_leaf():
+                node.make_predictor()
+                return
+            run_tree(node.get_down())
+            run_tree(node.get_up())
+
+        run_tree(self.tree_)
+
+    def _build_clf(self):
+        """ Build the correct classifier for the node
+        """
+        return (
+            LinearSVC(
+                max_iter=self.max_iter,
+                random_state=self.random_state,
+                C=self.C,
+                tol=self.tol,
+            )
+            if self.kernel == "linear"
+            else SVC(
+                kernel=self.kernel,
+                max_iter=self.max_iter,
+                tol=self.tol,
+                C=self.C,
+                gamma=self.gamma,
+                degree=self.degree,
+            )
+        )
+
+    def _reorder_results(self, y: np.array, indices: np.array) -> np.array:
+        """Reorder an array based on the array of indices passed
+
+        :param y: unordered data
+        :type y: np.array
+        :param indices: indices used to set order
+        :type indices: np.array
+        :return: array y ordered
+        :rtype: np.array
+        """
+        # return array of same type given in y
+        y_ordered = y.copy()
+        indices = indices.astype(int)
+        for i, index in enumerate(indices):
+            y_ordered[index] = y[i]
+        return y_ordered
+
+    def predict(self, X: np.array) -> np.array:
+        """Predict labels for each sample in dataset passed
+
+        :param X: dataset of samples
+        :type X: np.array
+        :return: array of labels
+        :rtype: np.array
+        """
+
+        def predict_class(
+            xp: np.array, indices: np.array, node: Snode
+        ) -> np.array:
+            if xp is None:
+                return [], []
+            if node.is_leaf():
+                # set a class for every sample in dataset
+                prediction = np.full((xp.shape[0], 1), node._class)
+                return prediction, indices
+            down = self._split_criteria(self._distances(node, xp), node)
+            x_u, x_d = self._split_array(xp, down)
+            i_u, i_d = self._split_array(indices, down)
+            prx_u, prin_u = predict_class(x_u, i_u, node.get_up())
+            prx_d, prin_d = predict_class(x_d, i_d, node.get_down())
+            return np.append(prx_u, prx_d), np.append(prin_u, prin_d)
+
+        # sklearn check
+        check_is_fitted(self, ["tree_"])
+        # Input validation
+        X = check_array(X)
+        # setup prediction & make it happen
+        indices = np.arange(X.shape[0])
+        result = (
+            self._reorder_results(*predict_class(X, indices, self.tree_))
+            .astype(int)
+            .ravel()
+        )
+        return self.classes_[result]
+
+    def score(
+        self, X: np.array, y: np.array, sample_weight: np.array = None
+    ) -> float:
+        """Compute accuracy of the prediction
+
+        :param X: dataset of samples to make predictions
+        :type X: np.array
+        :param y: true labels of the samples
+        :type y: np.array
+        :param sample_weight: weights of the samples, used to compute a
+        weighted accuracy
+        :type sample_weight: np.array optional
+        :return: accuracy of the prediction
+        :rtype: float
+        """
+        # sklearn check
+        check_is_fitted(self)
+        check_classification_targets(y)
+        X, y = check_X_y(X, y)
+        y_pred = self.predict(X).reshape(y.shape)
+        # Compute accuracy for each possible representation
+        y_type, y_true, y_pred = _check_targets(y, y_pred)
+        check_consistent_length(y_true, y_pred, sample_weight)
+        score = y_true == y_pred
+        return _weighted_sum(score, sample_weight, normalize=True)
+
+    def __iter__(self) -> Siterator:
+        """Create an iterator to visit the nodes of the tree in preorder;
+        it can also be used to build a list with all the nodes in preorder
+
+        :return: an iterator, usable in a for loop or with list(...)
+        :rtype: Siterator
+        """
+        try:
+            tree = self.tree_
+        except AttributeError:
+            tree = None
+        return Siterator(tree)
+
+    def __str__(self) -> str:
+        """String representation of the tree
+
+        :return: description of nodes in the tree in preorder
+        :rtype: str
+        """
+        output = ""
+        for i in self:
+            output += str(i) + "\n"
+        return output
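The recursion in `predict` concatenates the "up" predictions before the "down" ones, so the results come back in traversal order rather than input order, which is why the sample indices travel alongside the data. The body of `_reorder_results` is not shown in this part of the diff; the sketch below is only an assumption of what such a step needs to do, using a hypothetical one-level `route` function in place of a real node.

```python
import numpy as np


def route(X, indices):
    """Toy stand-in for one tree level (hypothetical, not the Stree code):
    samples with a positive first feature go 'up' and get class 1,
    the rest go 'down' and get class 0."""
    down = X[:, 0] <= 0
    pred_up = np.full(int((~down).sum()), 1)
    pred_down = np.full(int(down.sum()), 0)
    # Results are concatenated up-then-down, so input order is lost;
    # the indices are concatenated the same way to remember it.
    return np.append(pred_up, pred_down), np.append(indices[~down], indices[down])


def reorder(pred, indices):
    """Put every prediction back at the position of its original sample."""
    ordered = np.empty_like(pred)
    ordered[indices] = pred
    return ordered


X = np.array([[-1.0], [2.0], [-3.0], [4.0]])
pred, idx = route(X, np.arange(X.shape[0]))
ordered = reorder(pred, idx)
print(pred.tolist(), idx.tolist(), ordered.tolist())
```

Scattering with `ordered[indices] = pred` avoids sorting and works because every original index appears exactly once in the concatenated index array.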
--- a/stree/Strees_grapher.py
+++ b/stree/Strees_grapher.py
@@ -0,0 +1,205 @@
+"""
+__author__ = "Ricardo Montañana Gómez"
+__copyright__ = "Copyright 2020, Ricardo Montañana Gómez"
+__license__ = "MIT"
+__version__ = "0.9"
+Plot 3D views of nodes in Stree
+"""
+
+import os
+
+import matplotlib.pyplot as plt
+import numpy as np
+from sklearn.decomposition import PCA
+from mpl_toolkits.mplot3d import Axes3D
+
+from .Strees import Stree, Snode, Siterator
+
+
+class Snode_graph(Snode):
+    def __init__(self, node: Snode):
+        self._plot_size = (8, 8)
+        self._xlimits = (None, None)
+        self._ylimits = (None, None)
+        self._zlimits = (None, None)
+        n = Snode.copy(node)
+        super().__init__(n._clf, n._X, n._y, n._title)
+
+    def set_plot_size(self, size: tuple):
+        self._plot_size = size
+
+    def get_plot_size(self) -> tuple:
+        return self._plot_size
+
+    def _is_pure(self) -> bool:
+        """A leaf node with a single label is considered pure
+        """
+        if self.is_leaf():
+            return self._belief == 1.0
+        return False
+
+    def set_axis_limits(self, limits: tuple):
+        self._xlimits, self._ylimits, self._zlimits = limits
+
+    def get_axis_limits(self) -> tuple:
+        return self._xlimits, self._ylimits, self._zlimits
+
+    def _set_graphics_axis(self, ax: Axes3D):
+        ax.set_xlim(self._xlimits)
+        ax.set_ylim(self._ylimits)
+        ax.set_zlim(self._zlimits)
+
+    def save_hyperplane(
+        self, save_folder: str = "./", save_prefix: str = "", save_seq: int = 1
+    ):
+        _, fig = self.plot_hyperplane()
+        name = os.path.join(save_folder, f"{save_prefix}STnode{save_seq}.png")
+        fig.savefig(name, bbox_inches="tight")
+        plt.close(fig)
+
+    def _get_cmap(self):
+        cmap = "jet"
+        if self._is_pure() and self._class == 1:
+            cmap = "jet_r"
+        return cmap
+
+    def _graph_title(self):
+        n_class, card = np.unique(self._y, return_counts=True)
+        return f"{self._title} {n_class} {card}"
+
+    def plot_hyperplane(self, plot_distribution: bool = True):
+        fig = plt.figure(figsize=self._plot_size)
+        ax = fig.add_subplot(1, 1, 1, projection="3d")
+        if not self._is_pure():
+            # Can't plot the hyperplane of a leaf with one label because it
+            # has no classifier
+            # get the splitting hyperplane
+            def hyperplane(x, y):
+                return (
+                    -self._clf.intercept_
+                    - self._clf.coef_[0][0] * x
+                    - self._clf.coef_[0][1] * y
+                ) / self._clf.coef_[0][2]
+
+            tmpx = np.linspace(self._X[:, 0].min(), self._X[:, 0].max())
+            tmpy = np.linspace(self._X[:, 1].min(), self._X[:, 1].max())
+            xx, yy = np.meshgrid(tmpx, tmpy)
+            ax.plot_surface(
+                xx,
+                yy,
+                hyperplane(xx, yy),
+                alpha=0.5,
+                antialiased=True,
+                rstride=1,
+                cstride=1,
+                cmap="seismic",
+            )
+            self._set_graphics_axis(ax)
+        if plot_distribution:
+            self.plot_distribution(ax)
+        else:
+            plt.title(self._graph_title())
+            plt.show()
+        return ax, fig
+
+    def plot_distribution(self, ax: Axes3D = None):
+        if ax is None:
+            fig = plt.figure(figsize=self._plot_size)
+            ax = fig.add_subplot(1, 1, 1, projection="3d")
+        plt.title(self._graph_title())
+        cmap = self._get_cmap()
+        ax.scatter(
+            self._X[:, 0], self._X[:, 1], self._X[:, 2], c=self._y, cmap=cmap
+        )
+        ax.set_xlabel("X0")
+        ax.set_ylabel("X1")
+        ax.set_zlabel("X2")
+        plt.show()
+
+
+class Stree_grapher(Stree):
+    """Build 3D graphs of any dataset; if it has more than 3 features, PCA
+    reduces it to 3 components
+    """
+
+    def __init__(self, params: dict):
+        self._plot_size = (8, 8)
+        self._tree_gr = None
+        # make Snode store X's
+        os.environ["TESTING"] = "1"
+        self._fitted = False
+        self._pca = None
+        super().__init__(**params)
+
+    def __del__(self):
+        try:
+            os.environ.pop("TESTING")
+        except KeyError:
+            pass
+
+    def _copy_tree(self, node: Snode) -> Snode_graph:
+        mirror = Snode_graph(node)
+        # clone node
+        mirror._class = node._class
+        mirror._belief = node._belief
+        if node.get_down() is not None:
+            mirror.set_down(self._copy_tree(node.get_down()))
+        if node.get_up() is not None:
+            mirror.set_up(self._copy_tree(node.get_up()))
+        return mirror
+
+    def fit(
+        self, X: np.array, y: np.array, sample_weight: np.array = None
+    ) -> "Stree_grapher":
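+        """Fit the Stree and copy the tree in a Snode_graph tree
+
+        :param X: Dataset
+        :type X: np.array
+        :param y: Labels
+        :type y: np.array
+        :param sample_weight: weights of the samples, defaults to None
+        :type sample_weight: np.array, optional
+        :return: the fitted grapher
+        :rtype: Stree_grapher
+        """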
+        if X.shape[1] != 3:
+            self._pca = PCA(n_components=3)
+            X = self._pca.fit_transform(X)
+        super().fit(X, y, sample_weight=sample_weight)
+        self._tree_gr = self._copy_tree(self.tree_)
+        self._fitted = True
+        return self
+
+    def score(self, X: np.array, y: np.array) -> float:
+        self._check_fitted()
+        if X.shape[1] != 3:
+            X = self._pca.transform(X)
+        return super().score(X, y)
+
+    def _check_fitted(self):
+        if not self._fitted:
+            raise Exception("Have to fit the grapher first!")
+
+    def save_all(self, save_folder: str = "./", save_prefix: str = ""):
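+        """Save all the node plots in png format, each with a sequence number
+
+        :param save_folder: folder where the plots are saved, defaults to './'
+        :type save_folder: str, optional
+        :param save_prefix: prefix added to each file name, defaults to ''
+        :type save_prefix: str, optional
+        """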
+        self._check_fitted()
+        if not os.path.isdir(save_folder):
+            os.mkdir(save_folder)
+        seq = 1
+        for node in self:
+            node.save_hyperplane(
+                save_folder=save_folder, save_prefix=save_prefix, save_seq=seq
+            )
+            seq += 1
+
+    def plot_all(self):
+        """Plots all the nodes
+        """
+        self._check_fitted()
+        for node in self:
+            node.plot_hyperplane()
+
+    def __iter__(self):
+        return Siterator(self._tree_gr)
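The `hyperplane` helper in `plot_hyperplane` solves the decision boundary `w0*x + w1*y + w2*z + b = 0` for `z`, which is what lets `plot_surface` draw it over an `(x, y)` grid. Below is a minimal check of that algebra, assuming made-up values for `coef_` and `intercept_` (they do not come from a fitted model). The expression is undefined when `w2` is zero, that is, when the plane is parallel to the z axis.

```python
import numpy as np

# Hypothetical coefficients shaped like a linear SVM's coef_ / intercept_
w = np.array([[0.5, -1.0, 2.0]])
b = np.array([0.25])


def hyperplane(x, y):
    # z such that w0*x + w1*y + w2*z + b == 0
    return (-b - w[0][0] * x - w[0][1] * y) / w[0][2]


xx, yy = np.meshgrid(np.linspace(-1, 1, 5), np.linspace(-2, 2, 5))
zz = hyperplane(xx, yy)

# Every point on the surface should satisfy the boundary equation
residual = w[0][0] * xx + w[0][1] * yy + w[0][2] * zz + b
print(np.allclose(residual, 0.0))
```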
--- a/stree/__init__.py
+++ b/stree/__init__.py
@@ -0,0 +1,4 @@
+from .Strees import Stree, Snode, Siterator
+from .Strees_grapher import Stree_grapher, Snode_graph
+
+__all__ = ["Stree", "Snode", "Siterator", "Stree_grapher", "Snode_graph"]
--- a/stree/tests/Strees_grapher_test.py
+++ b/stree/tests/Strees_grapher_test.py
@@ -0,0 +1,226 @@
+import os
+import imghdr
+import unittest
+
+import numpy as np
+import matplotlib
+import matplotlib.pyplot as plt
+import warnings
+from sklearn.datasets import make_classification
+
+from stree import Stree_grapher, Snode_graph, Snode
+
+
+def get_dataset(random_state=0, n_features=3):
+    X, y = make_classification(
+        n_samples=1500,
+        n_features=n_features,
+        n_informative=3,
+        n_redundant=0,
+        n_repeated=0,
+        n_classes=2,
+        n_clusters_per_class=2,
+        class_sep=1.5,
+        flip_y=0,
+        weights=[0.5, 0.5],
+        random_state=random_state,
+    )
+    return X, y
+
+
+class Stree_grapher_test(unittest.TestCase):
+    def __init__(self, *args, **kwargs):
+        self._random_state = 1
+        self._clf = Stree_grapher(dict(random_state=self._random_state))
+        self._clf.fit(*get_dataset(self._random_state, n_features=4))
+        super().__init__(*args, **kwargs)
+
+    @classmethod
+    def setUp(cls):
+        os.environ["TESTING"] = "1"
+
+    def test_iterator(self):
+        """Check preorder iterator
+        """
+        expected = [
+            "root",
+            "root - Down",
+            "root - Down - Down, <cgaf> - Leaf class=1 belief= 0.976023 counts"
+            "=(array([0, 1]), array([ 17, 692]))",
+            "root - Down - Up",
+            "root - Down - Up - Down, <cgaf> - Leaf class=0 belief= 0.500000 "
+            "counts=(array([0, 1]), array([1, 1]))",
+            "root - Down - Up - Up, <cgaf> - Leaf class=0 belief= 0.888889 "
+            "counts=(array([0, 1]), array([8, 1]))",
+            "root - Up, <cgaf> - Leaf class=0 belief= 0.928205 counts=(array("
+            "[0, 1]), array([724,  56]))",
+        ]
+        computed = []
+        for node in self._clf:
+            computed.append(str(node))
+        self.assertListEqual(expected, computed)
+
+    def test_score(self):
+        X, y = get_dataset(self._random_state)
+        accuracy_score = self._clf.score(X, y)
+        yp = self._clf.predict(X)
+        accuracy_computed = np.mean(yp == y)
+        self.assertEqual(accuracy_score, accuracy_computed)
+        self.assertGreater(accuracy_score, 0.86)
+
+    def test_score_4dims(self):
+        X, y = get_dataset(self._random_state, n_features=4)
+        accuracy_score = self._clf.score(X, y)
+        self.assertEqual(accuracy_score, 0.95)
+
+    def test_save_all(self):
+        folder_name = os.path.join(os.sep, "tmp", "stree")
+        if os.path.isdir(folder_name):
+            os.rmdir(folder_name)
+        file_names = [
+            os.path.join(folder_name, f"STnode{i}.png") for i in range(1, 8)
+        ]
+        with warnings.catch_warnings():
+            warnings.simplefilter("ignore")
+            matplotlib.use("Agg")
+            self._clf.save_all(save_folder=folder_name)
+        for file_name in file_names:
+            self.assertTrue(os.path.exists(file_name))
+            self.assertEqual("png", imghdr.what(file_name))
+            os.remove(file_name)
+        os.rmdir(folder_name)
+
+    def test_plot_all(self):
+        with warnings.catch_warnings():
+            warnings.simplefilter("ignore")
+            matplotlib.use("Agg")
+            num_figures_before = plt.gcf().number
+            self._clf.plot_all()
+            num_figures_after = plt.gcf().number
+        self.assertEqual(7, num_figures_after - num_figures_before)
+
+
+class Snode_graph_test(unittest.TestCase):
+    def __init__(self, *args, **kwargs):
+        self._random_state = 1
+        self._clf = Stree_grapher(dict(random_state=self._random_state))
+        self._clf.fit(*get_dataset(self._random_state))
+        super().__init__(*args, **kwargs)
+
+    @classmethod
+    def setUp(cls):
+        os.environ["TESTING"] = "1"
+
+    def test_plot_size(self):
+        default = self._clf._tree_gr.get_plot_size()
+        expected = (17, 3)
+        self._clf._tree_gr.set_plot_size(expected)
+        self.assertEqual(expected, self._clf._tree_gr.get_plot_size())
+        self._clf._tree_gr.set_plot_size(default)
+        self.assertEqual(default, self._clf._tree_gr.get_plot_size())
+
+    def test_attributes_in_leaves_graph(self):
+        """Check if the attributes in leaves have correct values so they form a
+        predictor
+        """
+
+        def check_leave(node: Snode_graph):
+            if not node.is_leaf():
+                check_leave(node.get_down())
+                check_leave(node.get_up())
+                return
+            # Check belief in the leaf
+            classes, card = np.unique(node._y, return_counts=True)
+            max_card = max(card)
+            min_card = min(card)
+            if len(classes) > 1:
+                try:
+                    belief = max_card / (max_card + min_card)
+                except ZeroDivisionError:
+                    belief = 0.0
+            else:
+                belief = 1
+            self.assertEqual(belief, node._belief)
+            # Check Class
+            class_computed = classes[card == max_card]
+            self.assertEqual(class_computed, node._class)
+
+        check_leave(self._clf._tree_gr)
+
+    def test_nodes_graph_coefs(self):
+        """Check if the nodes of the tree have the right attributes filled
+        """
+
+        def run_tree(node: Snode_graph):
+            if node._belief < 1:
+                # only exclude pure leaves
+                self.assertIsNotNone(node._clf)
+                self.assertIsNotNone(node._clf.coef_)
+            if node.is_leaf():
+                return
+            run_tree(node.get_down())
+            run_tree(node.get_up())
+
+        run_tree(self._clf._tree_gr)
+
+    def test_save_hyperplane(self):
+        folder_name = "/tmp/"
+        file_name = os.path.join(folder_name, "STnode1.png")
+        with warnings.catch_warnings():
+            warnings.simplefilter("ignore")
+            matplotlib.use("Agg")
+            self._clf._tree_gr.save_hyperplane(folder_name)
+        self.assertTrue(os.path.exists(file_name))
+        self.assertEqual("png", imghdr.what(file_name))
+        os.remove(file_name)
+
+    def test_plot_hyperplane_with_distribution(self):
+        plt.close()
+        # select a pure node
+        node = self._clf._tree_gr.get_down().get_up().get_up()
+        with warnings.catch_warnings():
+            warnings.simplefilter("ignore")
+            matplotlib.use("Agg")
+            num_figures_before = plt.gcf().number
+            node.plot_hyperplane(plot_distribution=True)
+            num_figures_after = plt.gcf().number
+        self.assertEqual(1, num_figures_after - num_figures_before)
+
+    def test_plot_hyperplane_without_distribution(self):
+        plt.close()
+        with warnings.catch_warnings():
+            warnings.simplefilter("ignore")
+            matplotlib.use("Agg")
+            num_figures_before = plt.gcf().number
+            self._clf._tree_gr.plot_hyperplane(plot_distribution=False)
+            num_figures_after = plt.gcf().number
+        self.assertEqual(1, num_figures_after - num_figures_before)
+
+    def test_plot_distribution(self):
+        plt.close()
+        with warnings.catch_warnings():
+            warnings.simplefilter("ignore")
+            matplotlib.use("Agg")
+            num_figures_before = plt.gcf().number
+            self._clf._tree_gr.plot_distribution()
+            num_figures_after = plt.gcf().number
+        self.assertEqual(1, num_figures_after - num_figures_before)
+
+    def test_set_axis_limits(self):
+        node = Snode_graph(Snode(None, None, None, "test"))
+        limits = (-2, 2), (-3, 3), (-4, 4)
+        node.set_axis_limits(limits)
+        computed = node.get_axis_limits()
+        x, y, z = limits
+        xx, yy, zz = computed
+        self.assertEqual(x, xx)
+        self.assertEqual(y, yy)
+        self.assertEqual(z, zz)
+
+    def test_cmap_change(self):
+        node = Snode_graph(Snode(None, None, None, "test"))
+        self.assertEqual("jet", node._get_cmap())
+        # make node pure
+        node._belief = 1.0
+        node._class = 1
+        self.assertEqual("jet_r", node._get_cmap())
--- a/stree/tests/Strees_test.py
+++ b/stree/tests/Strees_test.py
@@ -0,0 +1,355 @@
+import os
+import unittest
+
+import numpy as np
+from sklearn.datasets import make_classification, load_iris
+
+from stree import Stree, Snode
+
+
+def get_dataset(random_state=0, n_classes=2):
+    X, y = make_classification(
+        n_samples=1500,
+        n_features=3,
+        n_informative=3,
+        n_redundant=0,
+        n_repeated=0,
+        n_classes=n_classes,
+        n_clusters_per_class=2,
+        class_sep=1.5,
+        flip_y=0,
+        random_state=random_state,
+    )
+    return X, y
+
+
+class Stree_test(unittest.TestCase):
+    def __init__(self, *args, **kwargs):
+        self._random_state = 1
+        self._kernels = ["linear", "rbf", "poly"]
+        super().__init__(*args, **kwargs)
+
+    @classmethod
+    def setUp(cls):
+        os.environ["TESTING"] = "1"
+
+    def _check_tree(self, node: Snode):
+        """Check recursively that the nodes that are not leaves have the
+        correct number of labels and their children have the right number of
+        elements in their datasets
+
+        Arguments:
+            node {Snode} -- node to check
+        """
+        if node.is_leaf():
+            return
+        y_prediction = node._clf.predict(node._X)
+        y_down = node.get_down()._y
+        y_up = node.get_up()._y
+        # Is a correct partition in terms of cardinality?
+        # i.e. The partition algorithm didn't forget any sample
+        self.assertEqual(node._y.shape[0], y_down.shape[0] + y_up.shape[0])
+        unique_y, count_y = np.unique(node._y, return_counts=True)
+        labels_d, count_d = np.unique(y_down, return_counts=True)
+        labels_u, count_u = np.unique(y_up, return_counts=True)
+        # Look counts up by label value, not by position in the arrays
+        # returned by np.unique: a child may lack some of the labels
+        down = dict(zip(labels_d.tolist(), count_d.tolist()))
+        up = dict(zip(labels_u.tolist(), count_u.tolist()))
+        for label, count in zip(unique_y.tolist(), count_y.tolist()):
+            self.assertEqual(count, down.get(label, 0) + up.get(label, 0))
+        # Is the partition made the same as the prediction?
+        # as the node is not a leaf...
+        _, count_yp = np.unique(y_prediction, return_counts=True)
+        self.assertEqual(count_yp[0], y_up.shape[0])
+        self.assertEqual(count_yp[1], y_down.shape[0])
+        self._check_tree(node.get_down())
+        self._check_tree(node.get_up())
+
+    def test_build_tree(self):
+        """Check if the tree is built the same way as predictions of models
+        """
+        import warnings
+
+        warnings.filterwarnings("ignore")
+        for kernel in self._kernels:
+            clf = Stree(kernel=kernel, random_state=self._random_state)
+            clf.fit(*get_dataset(self._random_state))
+            self._check_tree(clf.tree_)
+
+    def _find_out(
+        self, px: np.array, x_original: np.array, y_original
+    ) -> list:
+        """Find the original values of y for a given array of samples
+
+        Arguments:
+            px {np.array} -- array of samples to search for
+            x_original {np.array} -- original dataset
+            y_original {np.array} -- original classes
+
+        Returns:
+            list -- classes of the given samples
+        """
+        res = []
+        for needle in px:
+            for row in range(x_original.shape[0]):
+                if all(x_original[row, :] == needle):
+                    res.append(y_original[row])
+        return res
+
+    def test_single_prediction(self):
+        X, y = get_dataset(self._random_state)
+        for kernel in self._kernels:
+            clf = Stree(kernel=kernel, random_state=self._random_state)
+            yp = clf.fit(X, y).predict((X[0, :].reshape(-1, X.shape[1])))
+            self.assertEqual(yp[0], y[0])
+
+    def test_multiple_prediction(self):
+        # For the first 27 samples the predictions match the ground truth
+        num = 27
+        X, y = get_dataset(self._random_state)
+        for kernel in self._kernels:
+            clf = Stree(kernel=kernel, random_state=self._random_state)
+            yp = clf.fit(X, y).predict(X[:num, :])
+            self.assertListEqual(y[:num].tolist(), yp.tolist())
+
+    def test_score(self):
+        X, y = get_dataset(self._random_state)
+        accuracies = [
+            0.9506666666666667,
+            0.9606666666666667,
+            0.9433333333333334,
+        ]
+        for kernel, accuracy_expected in zip(self._kernels, accuracies):
+            clf = Stree(random_state=self._random_state, kernel=kernel,)
+            clf.fit(X, y)
+            accuracy_score = clf.score(X, y)
+            yp = clf.predict(X)
+            accuracy_computed = np.mean(yp == y)
+            self.assertEqual(accuracy_score, accuracy_computed)
+            self.assertAlmostEqual(accuracy_expected, accuracy_score)
+
+    def test_single_vs_multiple_prediction(self):
+        """Check if predicting sample by sample gives the same result as
+        predicting all samples at once
+        """
+        X, y = get_dataset(self._random_state)
+        for kernel in self._kernels:
+            clf = Stree(kernel=kernel, random_state=self._random_state)
+            clf.fit(X, y)
+            # Compute prediction line by line
+            yp_line = np.array([], dtype=int)
+            for xp in X:
+                yp_line = np.append(
+                    yp_line, clf.predict(xp.reshape(-1, X.shape[1]))
+                )
+            # Compute prediction at once
+            yp_once = clf.predict(X)
+            self.assertListEqual(yp_line.tolist(), yp_once.tolist())
+
+    def test_iterator_and_str(self):
+        """Check preorder iterator
+        """
+        expected = [
+            "root",
+            "root - Down",
+            "root - Down - Down, <cgaf> - Leaf class=1 belief= 0.975989 counts"
+            "=(array([0, 1]), array([ 17, 691]))",
+            "root - Down - Up",
+            "root - Down - Up - Down, <cgaf> - Leaf class=1 belief= 0.750000 "
+            "counts=(array([0, 1]), array([1, 3]))",
+            "root - Down - Up - Up, <pure> - Leaf class=0 belief= 1.000000 "
+            "counts=(array([0]), array([7]))",
+            "root - Up, <cgaf> - Leaf class=0 belief= 0.928297 counts=(array("
+            "[0, 1]), array([725,  56]))",
+        ]
+        computed = []
+        expected_string = ""
+        clf = Stree(kernel="linear", random_state=self._random_state)
+        clf.fit(*get_dataset(self._random_state))
+        for node in clf:
+            computed.append(str(node))
+            expected_string += str(node) + "\n"
+        self.assertListEqual(expected, computed)
+        self.assertEqual(expected_string, str(clf))
+
+    def test_is_a_sklearn_classifier(self):
+        import warnings
+        from sklearn.exceptions import ConvergenceWarning
+
+        warnings.filterwarnings("ignore", category=ConvergenceWarning)
+        warnings.filterwarnings("ignore", category=RuntimeWarning)
+        from sklearn.utils.estimator_checks import check_estimator
+
+        check_estimator(Stree())
+
+    def test_exception_if_C_is_negative(self):
+        tclf = Stree(C=-1)
+        with self.assertRaises(ValueError):
+            tclf.fit(*get_dataset(self._random_state))
+
+    def test_exception_if_bogus_split_criteria(self):
+        tclf = Stree(split_criteria="duck")
+        with self.assertRaises(ValueError):
+            tclf.fit(*get_dataset(self._random_state))
+
+    def test_check_max_depth_is_positive_or_None(self):
+        tcl = Stree()
+        self.assertIsNone(tcl.max_depth)
+        tcl = Stree(max_depth=1)
+        self.assertGreaterEqual(1, tcl.max_depth)
+        with self.assertRaises(ValueError):
+            tcl = Stree(max_depth=-1)
+            tcl.fit(*get_dataset(self._random_state))
+
+    def test_check_max_depth(self):
+        depths = (3, 4)
+        for depth in depths:
+            tcl = Stree(random_state=self._random_state, max_depth=depth)
+            tcl.fit(*get_dataset(self._random_state))
+            self.assertEqual(depth, tcl.depth_)
+
+    def test_unfitted_tree_is_iterable(self):
+        tcl = Stree()
+        self.assertEqual(0, len(list(tcl)))
+
+    def test_min_samples_split(self):
+        tcl_split = Stree(min_samples_split=3)
+        tcl_nosplit = Stree(min_samples_split=4)
+        dataset = [[1], [2], [3]], [1, 1, 0]
+        tcl_split.fit(*dataset)
+        self.assertIsNotNone(tcl_split.tree_.get_down())
+        self.assertIsNotNone(tcl_split.tree_.get_up())
+        tcl_nosplit.fit(*dataset)
+        self.assertIsNone(tcl_nosplit.tree_.get_down())
+        self.assertIsNone(tcl_nosplit.tree_.get_up())
+
+    def test_simple_multiclass_dataset(self):
+        for kernel in self._kernels:
+            clf = Stree(
+                kernel=kernel,
+                split_criteria="max_samples",
+                random_state=self._random_state,
+            )
+            px = [[1, 2], [5, 6], [9, 10]]
+            py = [0, 1, 2]
+            clf.fit(px, py)
+            self.assertEqual(1.0, clf.score(px, py))
+            self.assertListEqual(py, clf.predict(px).tolist())
+            self.assertListEqual(py, clf.classes_.tolist())
+
+    def test_multiclass_dataset(self):
+        datasets = {
+            "Synt": get_dataset(random_state=self._random_state, n_classes=3),
+            "Iris": load_iris(return_X_y=True),
+        }
+        outcomes = {
+            "Synt": {
+                "max_samples linear": 0.9533333333333334,
+                "max_samples rbf": 0.836,
+                "max_samples poly": 0.9473333333333334,
+                "min_distance linear": 0.9533333333333334,
+                "min_distance rbf": 0.836,
+                "min_distance poly": 0.9473333333333334,
+            },
+            "Iris": {
+                "max_samples linear": 0.98,
+                "max_samples rbf": 1.0,
+                "max_samples poly": 1.0,
+                "min_distance linear": 0.98,
+                "min_distance rbf": 1.0,
+                "min_distance poly": 1.0,
+            },
+        }
+        for name, dataset in datasets.items():
+            px, py = dataset
+            for criteria in ["max_samples", "min_distance"]:
+                for kernel in self._kernels:
+                    clf = Stree(
+                        C=1e4,
+                        max_iter=1e4,
+                        kernel=kernel,
+                        random_state=self._random_state,
+                    )
+                    clf.fit(px, py)
+                    outcome = outcomes[name][f"{criteria} {kernel}"]
+                    self.assertAlmostEqual(outcome, clf.score(px, py))
+
+
+class Snode_test(unittest.TestCase):
+    def __init__(self, *args, **kwargs):
+        self._random_state = 1
+        self._clf = Stree(random_state=self._random_state)
+        self._clf.fit(*get_dataset(self._random_state))
+        super().__init__(*args, **kwargs)
+
+    @classmethod
+    def setUp(cls):
+        os.environ["TESTING"] = "1"
+
+    def test_attributes_in_leaves(self):
+        """Check if the attributes in leaves have correct values so they form a
+        predictor
+        """
+
+        def check_leave(node: Snode):
+            if not node.is_leaf():
+                check_leave(node.get_down())
+                check_leave(node.get_up())
+                return
+            # Check belief in the leaf
+            classes, card = np.unique(node._y, return_counts=True)
+            max_card = max(card)
+            min_card = min(card)
+            if len(classes) > 1:
+                try:
+                    belief = max_card / (max_card + min_card)
+                except ZeroDivisionError:
+                    belief = 0.0
+            else:
+                belief = 1
+            self.assertEqual(belief, node._belief)
+            # Check Class
+            class_computed = classes[card == max_card]
+            self.assertEqual(class_computed, node._class)
+
+        check_leave(self._clf.tree_)
+
+    def test_nodes_coefs(self):
+        """Check if the nodes of the tree have the right attributes filled
+        """
+
+        def run_tree(node: Snode):
+            if node._belief < 1:
+                # only exclude pure leaves
+                self.assertIsNotNone(node._clf)
+                self.assertIsNotNone(node._clf.coef_)
+            if node.is_leaf():
+                return
+            run_tree(node.get_down())
+            run_tree(node.get_up())
+
+        run_tree(self._clf.tree_)
+
+    def test_make_predictor_on_leaf(self):
+        test = Snode(None, [1, 2, 3, 4], [1, 0, 1, 1], "test")
+        test.make_predictor()
+        self.assertEqual(1, test._class)
+        self.assertEqual(0.75, test._belief)
+
+    def test_make_predictor_on_not_leaf(self):
+        test = Snode(None, [1, 2, 3, 4], [1, 0, 1, 1], "test")
+        test.set_up(Snode(None, [1], [1], "another_test"))
+        test.make_predictor()
+        self.assertIsNone(test._class)
+        self.assertEqual(0, test._belief)
+
+    def test_make_predictor_on_leaf_bogus_data(self):
+        test = Snode(None, [1, 2, 3, 4], [], "test")
+        test.make_predictor()
+        self.assertIsNone(test._class)
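The `make_predictor` tests above pin down the expected behaviour on a leaf: labels `[1, 0, 1, 1]` give class 1 with belief 0.75, and an empty label list gives no class. The implementation of `Snode.make_predictor` is not part of this section, so the following is only a sketch of logic consistent with those assertions. `leaf_prediction` is a hypothetical helper, and returning a belief of 0.0 for empty labels is an assumption the tests do not check.

```python
import numpy as np


def leaf_prediction(y):
    """Return (label, belief) for a leaf holding the labels y.

    Hypothetical helper: the label is the most frequent class and the
    belief is its share of the samples in the leaf.
    """
    y = np.asarray(y)
    if y.size == 0:
        # No labels: nothing to predict (the belief value is an assumption)
        return None, 0.0
    classes, counts = np.unique(y, return_counts=True)
    # On ties np.argmax keeps the first, i.e. the lowest, label
    winner = int(np.argmax(counts))
    return classes[winner].item(), counts[winner] / y.size


label, belief = leaf_prediction([1, 0, 1, 1])
print(label, belief)
```

Dividing by the total number of samples generalises the binary `max_card / (max_card + min_card)` expression used in `check_leave` to leaves that hold more than two classes.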
--- a/stree/tests/__init__.py
+++ b/stree/tests/__init__.py
@@ -0,0 +1,9 @@
+from .Strees_test import Stree_test, Snode_test
+from .Strees_grapher_test import Stree_grapher_test, Snode_graph_test
+
+__all__ = [
+    "Stree_test",
+    "Snode_test",
+    "Stree_grapher_test",
+    "Snode_graph_test",
+]
--- a/test2.ipynb
+++ b/test2.ipynb
@@ -1,249 +0,0 @@
-{
- "cells": [
-  {
-   "cell_type": "code",
-   "execution_count": 1,
-   "metadata": {},
-   "outputs": [],
-   "source": [
-    "import numpy as np\n",
-    "import pandas as pd\n",
-    "from sklearn.svm import LinearSVC\n",
-    "from sklearn.tree import DecisionTreeClassifier\n",
-    "from sklearn.datasets import make_classification, load_iris, load_wine\n",
-    "from trees.Stree import Stree\n",
-    "import time"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": 2,
-   "metadata": {},
-   "outputs": [],
-   "source": [
-    "import os\n",
-    "if not os.path.isfile('data/creditcard.csv'):\n",
-    "    !wget --no-check-certificate --content-disposition http://nube.jccm.es/index.php/s/Zs7SYtZQJ3RQ2H2/download\n",
-    "    !tar xzf creditcard.tgz"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": 3,
-   "metadata": {},
-   "outputs": [
-    {
-     "output_type": "stream",
-     "name": "stdout",
-     "text": "Fraud: 0.173% 492\nValid: 99.827% 284315\nX.shape (1492, 28)  y.shape (1492,)\nFraud: 32.976% 492\nValid: 67.024% 1000\n"
-    }
-   ],
-   "source": [
-    "import time\n",
-    "from sklearn.model_selection import train_test_split\n",
-    "from trees.Stree import Stree\n",
-    "\n",
-    "random_state=1\n",
-    "\n",
-    "def load_creditcard(n_examples=0):\n",
-    "    import pandas as pd\n",
-    "    import numpy as np\n",
-    "    import random\n",
-    "    df = pd.read_csv('data/creditcard.csv')\n",
-    "    print(\"Fraud: {0:.3f}% {1}\".format(df.Class[df.Class == 1].count()*100/df.shape[0], df.Class[df.Class == 1].count()))\n",
-    "    print(\"Valid: {0:.3f}% {1}\".format(df.Class[df.Class == 0].count()*100/df.shape[0], df.Class[df.Class == 0].count()))\n",
-    "    y = df.Class\n",
-    "    X = df.drop(['Class', 'Time', 'Amount'], axis=1).values\n",
-    "    if n_examples > 0:\n",
-    "        # Take first n_examples samples\n",
-    "        X = X[:n_examples, :]\n",
-    "        y = y[:n_examples, :]\n",
-    "    else:\n",
-    "        # Take all the positive samples with a number of random negatives\n",
-    "        if n_examples < 0:\n",
-    "            Xt = X[(y == 1).ravel()]\n",
-    "            yt = y[(y == 1).ravel()]\n",
-    "            indices = random.sample(range(X.shape[0]), -1 * n_examples)\n",
-    "            X = np.append(Xt, X[indices], axis=0)\n",
-    "            y = np.append(yt, y[indices], axis=0)\n",
-    "    print(\"X.shape\", X.shape, \" y.shape\", y.shape)\n",
-    "    print(\"Fraud: {0:.3f}% {1}\".format(len(y[y == 1])*100/X.shape[0], len(y[y == 1])))\n",
-    "    print(\"Valid: {0:.3f}% {1}\".format(len(y[y == 0]) * 100 / X.shape[0], len(y[y == 0])))\n",
-    "    Xtrain, Xtest, ytrain, ytest = train_test_split(X, y, train_size=0.7, shuffle=True, random_state=random_state, stratify=y)\n",
-    "    return Xtrain, Xtest, ytrain, ytest\n",
-    "\n",
-    "# data = load_creditcard(-5000) # Take all true samples + 5000 of the others\n",
-    "# data = load_creditcard(5000)  # Take the first 5000 samples\n",
-    "data = load_creditcard(-1000) # Take all true samples + 1000 of the others\n",
-    "\n",
-    "Xtrain = data[0]\n",
-    "Xtest = data[1]\n",
-    "ytrain = data[2]\n",
-    "ytest = data[3]"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": 15,
-   "metadata": {},
-   "outputs": [],
-   "source": []
-  },
-  {
-   "cell_type": "code",
-   "execution_count": 4,
-   "metadata": {},
-   "outputs": [
-    {
-     "output_type": "stream",
-     "name": "stdout",
-     "text": "************** C=0.001 ****************************\nClassifier's accuracy (train): 0.9550\nClassifier's accuracy (test) : 0.9487\nroot\nroot - Down\nroot - Down - Down, <cgaf> - Leaf class=1 belief=0.977346 counts=(array([0, 1]), array([  7, 302]))\nroot - Up\nroot - Up - Down, <pure> - Leaf class=0 belief=1.000000 counts=(array([0]), array([1]))\nroot - Down - Up, <pure> - Leaf class=0 belief=1.000000 counts=(array([0]), array([1]))\nroot - Up - Up\nroot - Up - Up - Down, <pure> - Leaf class=1 belief=1.000000 counts=(array([1]), array([2]))\nroot - Up - Up - Up, <cgaf> - Leaf class=0 belief=0.945280 counts=(array([0, 1]), array([691,  40]))\n\n**************************************************\n************** C=0.01 ****************************\nClassifier's accuracy (train): 0.9569\nClassifier's accuracy (test) : 0.9576\nroot\nroot - Down, <cgaf> - Leaf class=1 belief=0.986971 counts=(array([0, 1]), array([  4, 303]))\nroot - Up, <cgaf> - Leaf class=0 belief=0.944369 counts=(array([0, 1]), array([696,  41]))\n\n**************************************************\n************** C=1 ****************************\nClassifier's accuracy (train): 0.9674\nClassifier's accuracy (test) : 0.9554\nroot\nroot - Down\nroot - Down - Down, <pure> - Leaf class=1 belief=1.000000 counts=(array([1]), array([310]))\nroot - Up, <cgaf> - Leaf class=0 belief=0.953232 counts=(array([0, 1]), array([693,  34]))\nroot - Down - Up, <pure> - Leaf class=0 belief=1.000000 counts=(array([0]), array([7]))\n\n**************************************************\n************** C=5 ****************************\nClassifier's accuracy (train): 0.9693\nClassifier's accuracy (test) : 0.9487\nroot\nroot - Down\nroot - Down - Down, <pure> - Leaf class=1 belief=1.000000 counts=(array([1]), array([310]))\nroot - Up\nroot - Up - Down, <pure> - Leaf class=0 belief=1.000000 counts=(array([0]), array([1]))\nroot - Down - Up, <pure> - Leaf class=0 belief=1.000000 counts=(array([0]), array([7]))\nroot - Up - Up\nroot - Up - Up - Down, <pure> - Leaf class=0 belief=1.000000 counts=(array([0]), array([2]))\nroot - Up - Up - Up\nroot - Up - Up - Up - Down, <pure> - Leaf class=0 belief=1.000000 counts=(array([0]), array([2]))\nroot - Up - Up - Up - Up\nroot - Up - Up - Up - Up - Down\nroot - Up - Up - Up - Up - Down - Down, <pure> - Leaf class=1 belief=1.000000 counts=(array([1]), array([2]))\nroot - Up - Up - Up - Up - Up, <cgaf> - Leaf class=0 belief=0.955494 counts=(array([0, 1]), array([687,  32]))\nroot - Up - Up - Up - Up - Down - Up, <pure> - Leaf class=0 belief=1.000000 counts=(array([0]), array([1]))\n\n**************************************************\n************** C=17 ****************************\nClassifier's accuracy (train): 0.9780\nClassifier's accuracy (test) : 0.9487\nroot\nroot - Down\nroot - Down - Down, <pure> - Leaf class=1 belief=1.000000 counts=(array([1]), array([301]))\nroot - Up\nroot - Up - Down, <pure> - Leaf class=1 belief=1.000000 counts=(array([1]), array([2]))\nroot - Down - Up\nroot - Down - Up - Down, <pure> - Leaf class=1 belief=1.000000 counts=(array([1]), array([15]))\nroot - Up - Up\nroot - Up - Up - Down\nroot - Up - Up - Down - Down, <pure> - Leaf class=1 belief=1.000000 counts=(array([1]), array([3]))\nroot - Down - Up - Up, <pure> - Leaf class=0 belief=1.000000 counts=(array([0]), array([15]))\nroot - Up - Up - Up, <cgaf> - Leaf class=0 belief=0.967468 counts=(array([0, 1]), array([684,  23]))\nroot - Up - Up - Down - Up, <pure> - Leaf class=0 belief=1.000000 counts=(array([0]), array([1]))\n\n**************************************************\n0.7277 secs\n"
-    }
-   ],
-   "source": [
-    "t = time.time()\n",
-    "for C in (.001, .01, 1, 5, 17):\n",
-    "    clf = Stree(C=C, random_state=random_state)\n",
-    "    clf.fit(Xtrain, ytrain)\n",
-    "    print(f\"************** C={C} ****************************\")\n",
-    "    print(f\"Classifier's accuracy (train): {clf.score(Xtrain, ytrain):.4f}\")\n",
-    "    print(f\"Classifier's accuracy (test) : {clf.score(Xtest, ytest):.4f}\")\n",
-    "    print(clf)\n",
-    "    print(f\"**************************************************\")\n",
-    "print(f\"{time.time() - t:.4f} secs\")"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": 5,
-   "metadata": {},
-   "outputs": [],
-   "source": [
-    "import numpy as np\n",
-    "from sklearn.preprocessing import StandardScaler\n",
-    "from sklearn.svm import LinearSVC\n",
-    "from sklearn.calibration import CalibratedClassifierCV\n",
-    "scaler = StandardScaler()\n",
-    "cclf = CalibratedClassifierCV(base_estimator=LinearSVC(), cv=5)\n",
-    "cclf.fit(Xtrain, ytrain)\n",
-    "res = cclf.predict_proba(Xtest)\n",
-    "#an array containing probabilities of belonging to the 1st class"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": 7,
-   "metadata": {},
-   "outputs": [
-    {
-     "output_type": "stream",
-     "name": "stdout",
-     "text": "root\nroot - Down\nroot - Down - Down, <pure> - Leaf class=1 belief=1.000000 counts=(array([1]), array([301]))\nroot - Up\nroot - Up - Down, <pure> - Leaf class=1 belief=1.000000 counts=(array([1]), array([2]))\nroot - Down - Up\nroot - Down - Up - Down, <pure> - Leaf class=1 belief=1.000000 counts=(array([1]), array([15]))\nroot - Up - Up\nroot - Up - Up - Down\nroot - Up - Up - Down - Down, <pure> - Leaf class=1 belief=1.000000 counts=(array([1]), array([3]))\nroot - Down - Up - Up, <pure> - Leaf class=0 belief=1.000000 counts=(array([0]), array([15]))\nroot - Up - Up - Up, <cgaf> - Leaf class=0 belief=0.967468 counts=(array([0, 1]), array([684,  23]))\nroot - Up - Up - Down - Up, <pure> - Leaf class=0 belief=1.000000 counts=(array([0]), array([1]))\n"
-    }
-   ],
-   "source": [
-    "for i in list(clf):\n",
-    "    print(i)"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": 8,
-   "metadata": {},
-   "outputs": [
-    {
-     "output_type": "stream",
-     "name": "stdout",
-     "text": "root\nroot - Down\nroot - Down - Down, <pure> - Leaf class=1 belief=1.000000 counts=(array([1]), array([301]))\nroot - Up\nroot - Up - Down, <pure> - Leaf class=1 belief=1.000000 counts=(array([1]), array([2]))\nroot - Down - Up\nroot - Down - Up - Down, <pure> - Leaf class=1 belief=1.000000 counts=(array([1]), array([15]))\nroot - Up - Up\nroot - Up - Up - Down\nroot - Up - Up - Down - Down, <pure> - Leaf class=1 belief=1.000000 counts=(array([1]), array([3]))\nroot - Down - Up - Up, <pure> - Leaf class=0 belief=1.000000 counts=(array([0]), array([15]))\nroot - Up - Up - Up, <cgaf> - Leaf class=0 belief=0.967468 counts=(array([0, 1]), array([684,  23]))\nroot - Up - Up - Down - Up, <pure> - Leaf class=0 belief=1.000000 counts=(array([0]), array([1]))\n"
-    }
-   ],
-   "source": [
-    "for i in clf:\n",
-    "    print(i)"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": 11,
-   "metadata": {},
-   "outputs": [
-    {
-     "output_type": "display_data",
-     "data": {
-      "text/plain": "Canvas(toolbar=Toolbar(toolitems=[('Home', 'Reset original view', 'home', 'home'), ('Back', 'Back to previous …",
-      "application/vnd.jupyter.widget-view+json": {
-       "version_major": 2,
-       "version_minor": 0,
-       "model_id": "0025f832c1734afc944021e5990c2d11"
-      }
-     },
-     "metadata": {}
-    }
-   ],
-   "source": [
-    "%matplotlib widget\n",
-    "from mpl_toolkits.mplot3d import Axes3D\n",
-    "import matplotlib.pyplot as plt\n",
-    "from matplotlib import cm\n",
-    "from matplotlib.ticker import LinearLocator, FormatStrFormatter\n",
-    "import numpy as np\n",
-    "\n",
-    "fig = plt.figure()\n",
-    "ax = fig.gca(projection='3d')\n",
-    "\n",
-    "scale = 8\n",
-    "# Make data.\n",
-    "X = np.arange(-scale, scale, 0.25)\n",
-    "Y = np.arange(-scale, scale, 0.25)\n",
-    "X, Y = np.meshgrid(X, Y)\n",
-    "Z = X**2 + Y**2\n",
-    "\n",
-    "# Plot the surface.\n",
-    "surf = ax.plot_surface(X, Y, Z, cmap=cm.coolwarm,\n",
-    "                   linewidth=0, antialiased=False)\n",
-    "\n",
-    "# Customize the z axis.\n",
-    "ax.set_zlim(0, 100)\n",
-    "ax.zaxis.set_major_locator(LinearLocator(10))\n",
-    "ax.zaxis.set_major_formatter(FormatStrFormatter('%.02f'))\n",
-    "\n",
-    "# rotate the axes and update\n",
-    "#for angle in range(0, 360):\n",
-    "#   ax.view_init(30, 40)\n",
-    "\n",
-    "# Add a color bar which maps values to colors.\n",
-    "fig.colorbar(surf, shrink=0.5, aspect=5)\n",
-    "\n",
-    "plt.show()"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": null,
-   "metadata": {},
-   "outputs": [],
-   "source": []
-  }
- ],
- "metadata": {
-  "kernelspec": {
-   "display_name": "Python 3",
-   "language": "python",
-   "name": "python3"
-  },
-  "language_info": {
-   "codemirror_mode": {
-    "name": "ipython",
-    "version": 3
-   },
-   "file_extension": ".py",
-   "mimetype": "text/x-python",
-   "name": "python",
-   "nbconvert_exporter": "python",
-   "pygments_lexer": "ipython3",
-   "version": "3.7.6-final"
-  }
- },
- "nbformat": 4,
- "nbformat_minor": 2
-}
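Note on the removed notebook's `load_creditcard`: for a negative `n_examples` it keeps every fraud row and then draws `-n_examples` random indices from the whole dataset, so a fraud row could in principle be drawn a second time. A minimal sketch of drawing the extra rows from the non-fraud rows only is shown below; the function name, the `seed` parameter, and the use of plain lists instead of pandas are assumptions made for illustration, not part of the removed code.

```python
import random


def positives_plus_random_negatives(labels, n_negatives, seed=1):
    """Return row indices: every positive (label 1) plus a random
    sample of n_negatives negatives (label 0), without duplicates.

    Hypothetical helper; the removed notebook sampled from all rows.
    """
    rng = random.Random(seed)  # local generator, reproducible by seed
    positives = [i for i, label in enumerate(labels) if label == 1]
    negatives = [i for i, label in enumerate(labels) if label == 0]
    # rng.sample raises ValueError if n_negatives > len(negatives)
    return positives + rng.sample(negatives, n_negatives)
```

Sampling from the negatives only guarantees the class counts reported after subsampling (all positives, exactly `n_negatives` negatives) rather than relying on the positives being rare.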
--- a/tests/Snode_test.py
+++ b/tests/Snode_test.py
@@ -1,72 +0,0 @@
-import os
-import unittest
-
-import numpy as np
-from sklearn.datasets import make_classification
-
-from trees.Stree import Stree, Snode
-
-
-class Snode_test(unittest.TestCase):
-
-    def __init__(self, *args, **kwargs):
-        os.environ['TESTING'] = '1'
-        self._random_state = 1
-        self._clf = Stree(random_state=self._random_state,
-                            use_predictions=True)
-        self._clf.fit(*self._get_Xy())
-        super(Snode_test, self).__init__(*args, **kwargs)
-
-    @classmethod
-    def tearDownClass(cls):
-        try:
-            os.environ.pop('TESTING')
-        except:
-            pass
-
-    def _get_Xy(self):
-        X, y = make_classification(n_samples=1500, n_features=3, n_informative=3,
-                                   n_redundant=0, n_repeated=0, n_classes=2, n_clusters_per_class=2,
-                                   class_sep=1.5, flip_y=0, weights=[0.5, 0.5], random_state=self._random_state)
-        return X, y
-
-    def test_attributes_in_leaves(self):
-        """Check if the attributes in leaves have correct values so they form a predictor
-        """
-        def check_leave(node: Snode):
-            if not node.is_leaf():
-                check_leave(node.get_down())
-                check_leave(node.get_up())
-                return
-            # Check Belief in leave
-            classes, card = np.unique(node._y, return_counts=True)
-            max_card = max(card)
-            min_card = min(card)
-            if len(classes) > 1:
-                try:
-                    belief = max_card / (max_card + min_card)
-                except:
-                    belief = 0.
-            else:
-                belief = 1
-            self.assertEqual(belief, node._belief)
-            # Check Class
-            class_computed = classes[card == max_card]
-            self.assertEqual(class_computed, node._class)
-        check_leave(self._clf._tree)
-    
-    def test_nodes_coefs(self):
-        """Check if the nodes of the tree have the right attributes filled
-        """
-        def run_tree(node: Snode):
-            if node._belief < 1:
-                # only exclude pure leaves
-                self.assertIsNotNone(node._clf)
-                self.assertIsNotNone(node._clf.coef_)
-                self.assertIsNotNone(node._vector)
-                self.assertIsNotNone(node._interceptor)
-            if node.is_leaf():
-                return
-            run_tree(node.get_down())
-            run_tree(node.get_up())
-        run_tree(self._clf._tree)
--- a/tests/Stree_test.py
+++ b/tests/Stree_test.py
@@ -1,223 +0,0 @@
-import csv
-import os
-import unittest
-
-import numpy as np
-from sklearn.datasets import make_classification
-
-from trees.Stree import Stree, Snode
-
-
-class Stree_test(unittest.TestCase):
-
-    def __init__(self, *args, **kwargs):
-        os.environ['TESTING'] = '1'
-        self._random_state = 1
-        self._clf = Stree(random_state=self._random_state,
-                            use_predictions=False)
-        self._clf.fit(*self._get_Xy())
-        super(Stree_test, self).__init__(*args, **kwargs)
-
-    @classmethod
-    def tearDownClass(cls):
-        try:
-            os.environ.pop('TESTING')
-        except:
-            pass
-        
-    def _get_Xy(self):
-        X, y = make_classification(n_samples=1500, n_features=3, n_informative=3,
-                                   n_redundant=0, n_repeated=0, n_classes=2, n_clusters_per_class=2,
-                                   class_sep=1.5, flip_y=0, weights=[0.5, 0.5], random_state=self._random_state)
-        return X, y
-
-    def _check_tree(self, node: Snode):
-        """Check recursively that the nodes that are not leaves have the correct 
-        number of labels and its sons have the right number of elements in their dataset
-
-        Arguments:
-            node {Snode} -- node to check
-        """
-        if node.is_leaf():
-            return
-        y_prediction = node._clf.predict(node._X)
-        y_down = node.get_down()._y
-        y_up = node.get_up()._y
-        # Is a correct partition in terms of cardinality?
-        # i.e. The partition algorithm didn't forget any sample
-        self.assertEqual(node._y.shape[0], y_down.shape[0] + y_up.shape[0])
-        unique_y, count_y = np.unique(node._y, return_counts=True)
-        _, count_d = np.unique(y_down, return_counts=True)
-        _, count_u = np.unique(y_up, return_counts=True)
-        #
-        for i in unique_y:
-            try:
-                number_down = count_d[i]
-            except:
-                number_down = 0
-            try:
-                number_up = count_u[i]
-            except:
-                number_up = 0
-            self.assertEqual(count_y[i], number_down + number_up)
-        # Is the partition made the same as the prediction?
-        # as the node is not a leaf...
-        _, count_yp = np.unique(y_prediction, return_counts=True)
-        self.assertEqual(count_yp[0], y_up.shape[0])
-        self.assertEqual(count_yp[1], y_down.shape[0])
-        self._check_tree(node.get_down())
-        self._check_tree(node.get_up())
-
-    def test_build_tree(self):
-        """Check if the tree is built the same way as predictions of models
-        """
-        self._check_tree(self._clf._tree)
-
-    def _get_file_data(self, file_name: str) -> tuple:
-        """Return X, y from data, y is the last column in array
-
-        Arguments:
-            file_name {str} -- the file name
-
-        Returns:
-            tuple -- tuple with samples, categories
-        """
-        data = np.genfromtxt(file_name, delimiter=',')
-        data = np.array(data)
-        column_y = data.shape[1] - 1
-        fy = data[:, column_y]
-        fx = np.delete(data, column_y, axis=1)
-        return fx, fy
-
-    def _find_out(self, px: np.array, x_original: np.array, y_original) -> list:
-        """Find the original values of y for a given array of samples
-
-        Arguments:
-            px {np.array} -- array of samples to search for
-            x_original {np.array} -- original dataset
-            y_original {[type]} -- original classes
-
-        Returns:
-            np.array -- classes of the given samples
-        """
-        res = []
-        for needle in px:
-            for row in range(x_original.shape[0]):
-                if all(x_original[row, :] == needle):
-                    res.append(y_original[row])
-        return res
-
-    def test_subdatasets(self):
-        """Check if the subdatasets files have the same labels as the original dataset
-        """
-        self._clf.save_sub_datasets()
-        with open(self._clf.get_catalog_name()) as cat_file:
-            catalog = csv.reader(cat_file, delimiter=',')
-            for row in catalog:
-                X, y = self._get_Xy()
-                x_file, y_file = self._get_file_data(row[0])
-                y_original = np.array(self._find_out(x_file, X, y), dtype=int)
-                self.assertTrue(np.array_equal(y_file, y_original))
-    
-    def test_single_prediction(self):
-        X, y = self._get_Xy()
-        yp = self._clf.predict((X[0, :].reshape(-1, X.shape[1])))
-        self.assertEqual(yp[0], y[0])
-
-    def test_multiple_prediction(self):
-        # First 27 elements the predictions are the same as the truth
-        num = 27
-        X, y = self._get_Xy()
-        yp = self._clf.predict(X[:num, :])
-        self.assertListEqual(y[:num].tolist(), yp.tolist())
-
-    def test_score(self):
-        X, y = self._get_Xy()
-        accuracy_score = self._clf.score(X, y)
-        yp = self._clf.predict(X)
-        right = (yp == y).astype(int)
-        accuracy_computed = sum(right) / len(y)
-        self.assertEqual(accuracy_score, accuracy_computed)
-        self.assertGreater(accuracy_score, 0.8)
-    
-    def test_single_predict_proba(self):
-        """Check that element 28 has a prediction different from the current label
-        """
-        # Element 28 has a different prediction than the truth
-        X, y = self._get_Xy()
-        yp = self._clf.predict_proba(X[28, :].reshape(-1, X.shape[1]))
-        self.assertEqual(0, yp[0:, 0])
-        self.assertEqual(1, y[28])
-        self.assertEqual(0.29026400766, round(yp[0, 1], 11))
-
-    def test_multiple_predict_proba(self):
-        # First 27 elements the predictions are the same as the truth
-        num = 27
-        X, y = self._get_Xy()
-        yp = self._clf.predict_proba(X[:num, :])
-        self.assertListEqual(y[:num].tolist(), yp[:, 0].tolist())
-        expected_proba = [0.88395641, 0.36746962, 0.84158767, 0.34106833, 0.14269291, 0.85193236,
-                        0.29876058, 0.7282164,  0.85958616, 0.89517877, 0.99745224, 0.18860349,
-                        0.30756427, 0.8318412,  0.18981198, 0.15564624, 0.25740655, 0.22923355,
-                        0.87365959, 0.49928689, 0.95574351, 0.28761257, 0.28906333, 0.32643692,
-                        0.29788483, 0.01657364, 0.81149083]
-        self.assertListEqual(expected_proba, np.round(yp[:, 1], decimals=8).tolist())
-
-    def build_models(self):
-        """Build and train two models, model_clf will use the sklearn classifier to
-        compute predictions and split data. model_computed will use vector of
-        coefficients to compute both predictions and split data
-        """
-        model_clf = Stree(random_state=self._random_state,
-                            use_predictions=True)
-        model_computed = Stree(random_state=self._random_state,
-                            use_predictions=False)
-        X, y = self._get_Xy()
-        model_clf.fit(X, y)
-        model_computed.fit(X, y)
-        return model_clf, model_computed, X, y
-
-    def test_use_model_predict(self):
-        """Check that we get the same results whether we use the estimator in nodes
-        to compute labels or we use the hyperplane and the position of samples w.r.t. it
-        """
-        use_clf, use_math, X, _ = self.build_models()
-        self.assertListEqual(
-            use_clf.predict(X).tolist(),
-            use_math.predict(X).tolist()
-        )
-    
-    def test_use_model_score(self):
-        use_clf, use_math, X, y = self.build_models()
-        b = use_math.score(X, y)
-        self.assertEqual(
-            use_clf.score(X, y),
-           b
-        )
-        self.assertGreater(b, .95)
-
-    def test_use_model_predict_proba(self):
-        use_clf, use_math, X, _ = self.build_models()
-        self.assertListEqual(
-            use_clf.predict_proba(X).tolist(),
-            use_math.predict_proba(X).tolist()
-        )
-
-    def test_single_vs_multiple_prediction(self):
-        """Check if predicting sample by sample gives the same result as predicting
-        all samples at once
-        """
-        X, _ = self._get_Xy()
-        # Compute prediction line by line
-        yp_line = np.array([], dtype=int)
-        for xp in X:
-            yp_line = np.append(yp_line, self._clf.predict(xp.reshape(-1, X.shape[1])))
-        # Compute prediction at once
-        yp_once = self._clf.predict(X)
-        #
-        self.assertListEqual(yp_line.tolist(), yp_once.tolist())
-
-
-
-
-
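Note on the removed test modules: both fit the classifier inside `__init__` and clear the `TESTING` variable with a bare `try/except`. A minimal sketch of the more conventional `unittest` layout, with the fixture in `setUpClass` and `os.environ.pop` given a default, follows. `FakeClassifier`, `ExampleTest`, and `run_example` are hypothetical stand-ins so the sketch runs without scikit-learn; they are not part of the project.

```python
import os
import unittest


class FakeClassifier:
    """Hypothetical stand-in for Stree: predicts the majority label."""

    def fit(self, X, y):
        labels = list(y)
        self.majority_ = max(set(labels), key=labels.count)
        return self

    def predict(self, X):
        return [self.majority_ for _ in X]


class ExampleTest(unittest.TestCase):
    @classmethod
    def setUpClass(cls):
        # Runs once per class, before any test method
        os.environ["TESTING"] = "1"
        cls.X = [[0.0], [1.0], [2.0]]
        cls.y = [1, 1, 0]
        cls.clf = FakeClassifier().fit(cls.X, cls.y)

    @classmethod
    def tearDownClass(cls):
        # pop with a default never raises, so no try/except is needed
        os.environ.pop("TESTING", None)

    def test_predict_length(self):
        self.assertEqual(len(self.clf.predict(self.X)), len(self.y))


def run_example():
    """Run the example test case and return the unittest result."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(ExampleTest)
    return unittest.TextTestRunner(verbosity=0).run(suite)
```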
--- a/tests/init.py
+++ b/tests/init.py
--- a/trees/Siterator.py
+++ b/trees/Siterator.py
@@ -1,34 +0,0 @@
-'''
-__author__ = "Ricardo Montañana Gómez"
-__copyright__ = "Copyright 2020, Ricardo Montañana Gómez"
-__license__ = "MIT"
-__version__ = "0.9"
-Inorder iterator for the binary tree of Snodes
-Uses LinearSVC
-'''
-
-from trees.Snode import Snode
-
-
-class Siterator:
-    """Inorder iterator
-    """
-
-    def __init__(self, tree: Snode):
-        self._stack = []
-        self._push(tree)
-
-    def __iter__(self):
-        return self
-
-    def _push(self, node: Snode):
-        while (node is not None):
-            self._stack.insert(0, node)
-            node = node.get_down()
-
-    def __next__(self) -> Snode:
-        if len(self._stack) == 0:
-            raise StopIteration()
-        node = self._stack.pop()
-        self._push(node.get_up())
-        return node
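Note on the removed `Siterator`: it inserts nodes at the front of the list and pops from the end, so the root is yielded first, which matches the order of the notebook output earlier in this diff. For comparison, a minimal sketch of a conventional stack-based inorder traversal (down subtree, node, up subtree) is given below. `Node`, `InorderIterator`, and `inorder_titles` are hypothetical names used only for this illustration.

```python
class Node:
    """Hypothetical stand-in for Snode with only what the iterator needs."""

    def __init__(self, title, down=None, up=None):
        self.title = title
        self._down = down
        self._up = up

    def get_down(self):
        return self._down

    def get_up(self):
        return self._up


class InorderIterator:
    """Yield the down subtree, then the node, then the up subtree."""

    def __init__(self, tree):
        self._stack = []
        self._push(tree)

    def __iter__(self):
        return self

    def _push(self, node):
        # Walk down the "down" branch, stacking each node on top
        while node is not None:
            self._stack.append(node)
            node = node.get_down()

    def __next__(self):
        if not self._stack:
            raise StopIteration
        node = self._stack.pop()  # most recently pushed node (LIFO)
        self._push(node.get_up())
        return node


def inorder_titles(tree):
    """Collect node titles in inorder."""
    return [node.title for node in InorderIterator(tree)]
```

The only change from the removed class is `append` in place of `insert(0, ...)`, which makes the list behave as a LIFO stack.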
--- a/trees/Snode.py
+++ b/trees/Snode.py
@@ -1,70 +0,0 @@
-'''
-__author__ = "Ricardo Montañana Gómez"
-__copyright__ = "Copyright 2020, Ricardo Montañana Gómez"
-__license__ = "MIT"
-__version__ = "0.9"
-Node of the Stree (binary tree)
-'''
-
-import os
-
-import numpy as np
-from sklearn.svm import LinearSVC
-
-
-class Snode:
-    def __init__(self, clf: LinearSVC, X: np.ndarray, y: np.ndarray, title: str):
-        self._clf = clf
-        self._vector = None if clf is None else clf.coef_
-        self._interceptor = 0. if clf is None else clf.intercept_
-        self._title = title
-        self._belief = 0.  # belief of the prediction in a leaf node based on samples
-        # Only store dataset in Testing 
-        self._X = X if os.environ.get('TESTING', 'NS') != 'NS' else None
-        self._y = y
-        self._down = None
-        self._up = None
-        self._class = None
-
-    def set_down(self, son):
-        self._down = son
-
-    def set_up(self, son):
-        self._up = son
-
-    def is_leaf(self,) -> bool:
-        return self._up is None and self._down is None
-
-    def get_down(self) -> 'Snode':
-        return self._down
-
-    def get_up(self) -> 'Snode':
-        return self._up
-
-    def make_predictor(self):
-        """Compute the class of the predictor and its belief based on the subdataset of the node
-        only if it is a leaf
-        """
-        # Clean memory
-        #self._X = None
-        #self._y = None
-        if not self.is_leaf():
-            return
-        classes, card = np.unique(self._y, return_counts=True)
-        if len(classes) > 1:
-            max_card = max(card)
-            min_card = min(card)
-            try:
-                self._belief = max_card / (max_card + min_card)
-            except:
-                self._belief = 0.
-            self._class = classes[card == max_card][0]
-        else:
-            self._belief = 1
-            self._class = classes[0]
-
-    def __str__(self) -> str:
-        if self.is_leaf():
-            return f"{self._title} - Leaf class={self._class} belief={self._belief:.6f} counts={np.unique(self._y, return_counts=True)}"
-        else:
-            return f"{self._title}"
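Note on the removed `make_predictor`: the belief of a leaf is `max_card / (max_card + min_card)`, which equals the majority fraction only when the leaf holds two classes. A minimal sketch of a majority-fraction belief that also covers more than two classes is shown below. The function name `leaf_prediction` and the use of `collections.Counter` are assumptions for illustration; this is not the code that replaced the removed file.

```python
from collections import Counter


def leaf_prediction(labels):
    """Return (majority_label, belief) for the labels stored in a leaf.

    belief is the share of samples carrying the majority label.
    Hypothetical helper; labels must be non-empty.
    """
    counts = Counter(labels)
    # On ties, max keeps the label that was seen first
    label, top = max(counts.items(), key=lambda item: item[1])
    return label, top / len(labels)
```

With the counts printed in the notebook output for `C=0.001` (7 samples of class 0 and 302 of class 1), this gives class 1 with a belief of 302/309, the same `0.977346` reported there.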
--- a/trees/Stree.py
+++ b/trees/Stree.py
@@ -1,222 +0,0 @@
-'''
-__author__ = "Ricardo Montañana Gómez"
-__copyright__ = "Copyright 2020, Ricardo Montañana Gómez"
-__license__ = "MIT"
-__version__ = "0.9"
-Build an oblique tree classifier based on SVM Trees
-Uses LinearSVC
-'''
-
-import typing
-
-import numpy as np
-from sklearn.base import BaseEstimator, ClassifierMixin
-from sklearn.svm import LinearSVC
-from sklearn.utils.validation import check_X_y, check_array, check_is_fitted
-
-from trees.Snode import Snode
-from trees.Siterator import Siterator
-
-
-class Stree(BaseEstimator, ClassifierMixin):
-    """
-    """
-
-    def __init__(self, C=1.0, max_iter: int = 1000, random_state: int = 0, use_predictions: bool = False):
-        self._max_iter = max_iter
-        self._C = C
-        self._random_state = random_state
-        self._tree = None
-        self.__folder = 'data/'
-        self.__use_predictions = use_predictions
-        self.__trained = False
-        self.__proba = False
-
-    def get_params(self, deep=True):
-        """Get dict with hyperparameters and their values, as required by sklearn
-        """
-        return {"C": self._C, "random_state": self._random_state, 'max_iter': self._max_iter}
-
-    def set_params(self, **parameters):
-        """Set hyperparameters as specified by sklearn, needed in grid searches
-        """
-        for parameter, value in parameters.items():
-            setattr(self, parameter, value)
-        return self
-
-    def _linear_function(self, data: np.array, node: Snode) -> np.array:
-        coef = node._vector[0, :].reshape(-1, data.shape[1])
-        return data.dot(coef.T) + node._interceptor[0]
-
-    def _split_data(self, node: Snode, data: np.ndarray, indices: np.ndarray) -> list:
-        if self.__use_predictions:
-            yp = node._clf.predict(data)
-            down = (yp == 1).reshape(-1, 1)
-            res = np.expand_dims(node._clf.decision_function(data), 1)
-        else:
-            # doesn't work with multiclass as each sample has to do inner product with its own coefficients
-            # computes the position of every sample w.r.t. the hyperplane
-            res = self._linear_function(data, node)
-            down = res > 0
-        up = ~down
-        data_down = data[down[:, 0]] if any(down) else None
-        indices_down = indices[down[:, 0]] if any(down) else None
-        res_down = res[down[:, 0]] if any(down) else None
-        data_up = data[up[:, 0]] if any(up) else None
-        indices_up = indices[up[:, 0]] if any(up) else None
-        res_up = res[up[:, 0]] if any(up) else None
-        return [data_up, indices_up, data_down, indices_down, res_up, res_down]
-
-    def fit(self, X: np.ndarray, y: np.ndarray, title: str = 'root') -> 'Stree':
-        X, y = check_X_y(X, y.ravel())
-        self.n_features_in_ = X.shape[1]
-        self._tree = self.train(X, y.ravel(), title)
-        self._build_predictor()
-        self.__trained = True
-        return self
-
-    def _build_predictor(self):
-        """Process the leaves to make them predictors
-        """
-        def run_tree(node: Snode):
-            if node.is_leaf():
-                node.make_predictor()
-                return
-            run_tree(node.get_down())
-            run_tree(node.get_up())
-        run_tree(self._tree)
-
-    def train(self, X: np.ndarray, y: np.ndarray, title: str = 'root') -> Snode:
-        if np.unique(y).shape[0] == 1:
-            # only 1 class => pure dataset
-            return Snode(None, X, y, title + ', <pure>')
-        # Train the model
-        clf = LinearSVC(max_iter=self._max_iter, C=self._C,
-                        random_state=self._random_state)
-        clf.fit(X, y)
-        tree = Snode(clf, X, y, title)
-        X_U, y_u, X_D, y_d, _, _ = self._split_data(tree, X, y)
-        if X_U is None or X_D is None:
-            # didn't part anything
-            return Snode(clf, X, y, title + ', <cgaf>')
-        tree.set_up(self.train(X_U, y_u, title + ' - Up'))
-        tree.set_down(self.train(X_D, y_d, title + ' - Down'))
-        return tree
-
-    def _reorder_results(self, y: np.array, indices: np.array) -> np.array:
-        y_ordered = np.zeros(y.shape, dtype=int if y.ndim == 1 else float)
-        indices = indices.astype(int)
-        for i, index in enumerate(indices):
-            y_ordered[index] = y[i]
-        return y_ordered
-
-    def predict(self, X: np.array) -> np.array:
-        def predict_class(xp: np.array, indices: np.array, node: Snode) -> np.array:
-            if xp is None:
-                return [], []
-            if node.is_leaf():
-                # set a class for every sample in dataset
-                prediction = np.full((xp.shape[0], 1), node._class)
-                return prediction, indices
-            u, i_u, d, i_d, _, _ = self._split_data(node, xp, indices)
-            k, l = predict_class(d, i_d, node.get_down())
-            m, n = predict_class(u, i_u, node.get_up())
-            return np.append(k, m), np.append(l, n)
-        # sklearn check
-        check_is_fitted(self)
-        # Input validation
-        X = check_array(X)
-        # setup prediction & make it happen
-        indices = np.arange(X.shape[0])
-        return self._reorder_results(*predict_class(X, indices, self._tree))
-
-    def predict_proba(self, X: np.array) -> np.array:
-        """Computes an approximation of the probability of samples belonging to class 1 
-        (nothing more, nothing less)
-
-        :param X: dataset
-        :type X: np.array
-        """
-        def predict_class(xp: np.array, indices: np.array, dist: np.array, node: Snode) -> np.array:
-            """Run the tree to compute predictions
-
-            :param xp: subdataset of samples
-            :type xp: np.array
-            :param indices: indices of subdataset samples to rebuild original order
-            :type indices: np.array
-            :param dist: distances of every sample to the hyperplane or the father node
-            :type dist: np.array
-            :param node: node of the leaf with the class
-            :type node: Snode
-            :return: array of labels and distances, array of indices
-            :rtype: np.array
-            """
-            if xp is None:
-                return [], []
-            if node.is_leaf():
-                # set a class for every sample in dataset
-                prediction = np.full((xp.shape[0], 1), node._class)
-                prediction_proba = dist
-                return np.append(prediction, prediction_proba, axis=1), indices
-            u, i_u, d, i_d, r_u, r_d = self._split_data(node, xp, indices)
-            k, l = predict_class(d, i_d, r_d, node.get_down())
-            m, n = predict_class(u, i_u, r_u, node.get_up())
-            return np.append(k, m), np.append(l, n)
-        # sklearn check
-        check_is_fitted(self)
-        # Input validation
-        X = check_array(X)
-        # setup prediction & make it happen
-        indices = np.arange(X.shape[0])
-        result, indices = predict_class(X, indices, [], self._tree)
-        result = result.reshape(X.shape[0], 2)
-        # Turn distances to hyperplane into probabilities based on fitting distances
-        # of samples to its hyperplane that classified them, to the sigmoid function
-        result[:, 1] = 1 / (1 + np.exp(-result[:, 1]))
-        return self._reorder_results(result, indices)
-
-    def score(self, X: np.array, y: np.array) -> float:
-        """Return accuracy
-        """
-        if not self.__trained:
-            self.fit(X, y)
-        yp = self.predict(X).reshape(y.shape)
-        right = (yp == y).astype(int)
-        return np.sum(right) / len(y)
-
-    def __iter__(self):
-        return Siterator(self._tree)
-
-    def __str__(self) -> str:
-        output = ''
-        for i in self:
-            output += str(i) + '\n'
-        return output
-
-    def _save_datasets(self, tree: Snode, catalog: typing.TextIO, number: int):
-        """Save the dataset of the node in a csv file
-
-        :param tree: node with data to save
-        :type tree: Snode
-        :param catalog: catalog file handler
-        :type catalog: typing.TextIO
-        :param number: sequential number for the generated file name
-        :type number: int
-        """
-        data = np.append(tree._X, tree._y.reshape(-1, 1), axis=1)
-        name = f"{self.__folder}dataset{number}.csv"
-        np.savetxt(name, data, delimiter=",")
-        catalog.write(f"{name}, - {str(tree)}")
-        if tree.is_leaf():
-            return
-        self._save_datasets(tree.get_down(), catalog, number + 1)
-        self._save_datasets(tree.get_up(), catalog, number + 2)
-
-    def get_catalog_name(self):
-        return self.__folder + "catalog.txt"
-
-    def save_sub_datasets(self):
-        """Save every dataset stored in the tree to check against a manual classifier
-        """
-        with open(self.get_catalog_name(), 'w', encoding='utf-8') as catalog:
-            self._save_datasets(self._tree, catalog, 1)
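
The removed `predict_proba` above turned each sample's distance to its hyperplane into a score in (0, 1) with the sigmoid `1 / (1 + np.exp(-d))`. A minimal standalone sketch of that mapping follows, assuming plain Python floats instead of NumPy arrays. The names `sigmoid` and `distances_to_scores` are hypothetical and are not part of STree. The sign branch is an addition not present in the removed code: it avoids calling `math.exp` with a large positive argument, which would raise `OverflowError`.

```python
import math
from typing import List, Sequence


def sigmoid(distance: float) -> float:
    """Map a signed distance to the hyperplane into the interval [0, 1]."""
    # Branch on the sign so math.exp never receives a large positive argument.
    if distance >= 0:
        return 1.0 / (1.0 + math.exp(-distance))
    z = math.exp(distance)
    return z / (1.0 + z)


def distances_to_scores(distances: Sequence[float]) -> List[float]:
    """Apply the sigmoid to every distance (hypothetical helper, not STree API)."""
    return [sigmoid(d) for d in distances]


# Samples on the hyperplane get 0.5; opposite sides get complementary scores.
scores = distances_to_scores([-2.0, 0.0, 2.0])
```

These scores are monotone in the distance but are not calibrated probabilities, which is consistent with the docstring calling the result "an approximation".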
--- a/trees/__init__.py
+++ b/trees/__init__.py
Author	SHA1	Message	Date
Ricardo Montañana	1d392d534f	#6 - Update tests and codecov conf	2020-06-11 13:45:24 +02:00
Ricardo Montañana	f360a2640c	#6 - Add multiclass support Removed (by now) predict_proba. Created a notebook in jupyter Added split_criteria parameter with min_distance and max_samples values Refactor _distances Refactor _split_criteria Refactor _reorder_results	2020-06-11 13:10:52 +02:00
Ricardo Montañana Gómez	45510b43bc	Merge pull request #5 from Doctorado-ML/add_kernels #3 Add kernels to STree	2020-06-09 13:43:31 +02:00
Ricardo Montañana	286a91a3d7	#3 refactor unneeded code and new test	2020-06-09 13:01:01 +02:00
Ricardo Montañana	5c31c2b2a5	#3 update features notebook	2020-06-09 02:12:56 +02:00
Ricardo Montañana	7e932de072	#3 Add sample_weights to score, update notebooks Update readme to use new names of notebooks	2020-06-09 01:46:38 +02:00
Ricardo Montañana	26273e936a	#3 Add degree hyperparam and update notebooks Update readme to add new notebooks	2020-06-08 20:16:42 +02:00
Ricardo Montañana	d7c0bc3bc5	#3 Complete multiclass in Stree Add multiclass dimensions management in distances method Add gamma hyperparameter for non linear kernels	2020-06-08 13:54:24 +02:00
Ricardo Montañana	3a48d8b405	#3 Rewrite some tests & remove use_predictions Remove use_predictions parameter as of now, the model always use it	2020-06-08 01:51:21 +02:00
Ricardo Montañana	05b462716e	#3 First try, change LinearSVC to SVC make a builder start changing tests	2020-06-07 20:26:59 +02:00
Ricardo Montañana	b824229121	#1 Add min_samples_split Fix #1	2020-06-07 16:12:25 +02:00
Ricardo Montañana	8ba9b1b6a1	Remove travis ci and set codecov percentage	2020-06-06 19:47:00 +02:00
Ricardo Montañana	37577849db	Fix parameter missing in method overload	2020-06-06 18:18:03 +02:00
Ricardo Montañana	cb10aea36e	remove unneed test and cosmetic	2020-06-06 14:20:07 +02:00
Ricardo Montañana	b9f14aec05	#4 Add code coverage & codacy badge Add code coverage configuration in codecov Add some tests	2020-06-06 03:04:18 +02:00
Ricardo Montañana	b4816b2995	Show sample_weight use in test2 notebook Update revision to RC4 Lint Stree grapher	2020-05-30 23:59:40 +02:00
Ricardo Montañana	5e5fea9c6a	Document & lint code	2020-05-30 23:10:10 +02:00
Ricardo Montañana	724a4855fb	Adapt some notebooks	2020-05-30 11:09:59 +02:00
Ricardo Montañana	a22ae81b54	Refactor split_data adding sample_weight	2020-05-29 18:52:23 +02:00
Ricardo Montañana	ed98054f0d	First approach Added max_depth, tol, weighted samples	2020-05-29 12:46:10 +02:00
Ricardo Montañana	e95bd9697a	Make Stree a sklearn estimator Added check_estimator in notebook test2 Added a Stree test with check_estimator	2020-05-25 19:51:39 +02:00
Ricardo Montañana	5956cd0cd2	Update google colab setup in notebooks Undate save_all in grapher to make dest. folder if it doesn't exist	2020-05-24 20:13:27 +02:00
Ricardo Montañana	27b278860d	Fix install from scratch	2020-05-24 18:47:55 +02:00
Ricardo Montañana	d5d723c67f	update setup.py to include tests suite	2020-05-23 23:59:03 +02:00
Ricardo Montañana	77f10281c1	Make project python package friendly - Add setup.py - Move classes to module files - Move tests folder inside module folder	2020-05-23 23:40:33 +02:00
Ricardo Montañana	ac1483ae1d	update requirements to alllow maptlot widget	2020-05-23 00:05:58 +02:00
Ricardo Montañana	e51690ed95	Implement grapher and notebook to test it	2020-05-22 19:42:13 +02:00
Ricardo Montañana	a4595f5815	Update notebooks and readme with cosmetic changes	2020-05-20 18:11:57 +02:00
Ricardo Montañana	316f84cc63	Fix precision issues in tests executed in Travis	2020-05-20 15:02:31 +02:00
Ricardo Montañana	6e35628c85	Grapher working	2020-05-20 14:26:55 +02:00
Ricardo Montañana	c0ef71f139	first approx to grapher	2020-05-20 12:32:17 +02:00