mirror of
https://github.com/Doctorado-ML/Stree_datasets.git
synced 2025-08-18 00:46:03 +00:00
Commit Inicial (initial commit)
3428
data/tanveer/thyroid/ann-test.data
Executable file
File diff suppressed because it is too large
94
data/tanveer/thyroid/ann-thyroid.names
Executable file
@@ -0,0 +1,94 @@
NOTE: all files associated with this .names file have the ann- prefix.

1. Title: Thyroid Domain

2. Sources:

   (a) Donors: Randolf Werner
               evol@uniko.uni-koblenz.de
   (b) Obtained from Daimler-Benz.
   (c) Date: October 1992

3. Past Usage:

   (a) "Optimization of the Backpropagation Algorithm for Training Multilayer
       Perceptrons":

       ftp archive.cis.ohio-state.edu or ftp 128.146.8.52
       cd pub/neuroprose
       binary
       get schiff.bp_speedup.ps.Z
       quit

       The report is an overview of many different backprop speedup techniques.
       15 different algorithms are described in detail and compared using
       a large, very hard to solve, practical data set. Learning speed and network
       classification performance with respect to the training data set and also
       with respect to a testing data set are discussed.
       These are the tested algorithms:

       backprop
       backprop (batch mode)
       backprop + learning rate calculated by Eaton and Oliver's formula
       backprop + decreasing learning rate (Darken and Moody)
       backprop + learning rate adaptation for each training pattern (J. Schmidhuber)
       backprop + evolutionary learning rate adaptation (R. Salomon)
       backprop + angle-driven learning rate adaptation (Chan and Fallside)
       Polak-Ribiere + line search (Kramer and Vincentelli)
       Conj. gradient + line search (Leonard and Kramer)
       backprop + learning rate adaptation by sign changes (Silva and Almeida)
       SuperSAB (T. Tollenaere)
       Delta-Bar-Delta (Jacobs)
       RPROP (Riedmiller and Braun)
       Quickprop (Fahlman)
       Cascade correlation (Fahlman)

   (b) "Synthesis and Performance Analysis of Multilayer Neural Network Architectures":

       ftp archive.cis.ohio-state.edu or ftp 128.146.8.52
       cd pub/neuroprose
       binary
       get schiff.gann.ps.Z
       quit

       In this paper we present various approaches for automatic topology optimization
       of backpropagation networks. First of all, we review the basics of genetic
       algorithms, which are our essential tool for a topology search. Then we give a
       survey of backprop and the topological properties of feedforward networks. We
       report on pioneering work in the field of topology optimization. Our first
       approach was based on evolution strategies, which used only mutation to change
       the parents' topologies. We have now found a way to extend this approach with a
       crossover operator, which is essential to all genetic search methods.
       In contrast to competing approaches, it allows two parent networks with
       different numbers of units to mate and produce a (valid) child network, which
       inherits genes from both of the parents. We applied our genetic algorithm to a
       medical classification problem which is extremely difficult to solve. The
       performance with respect to the training set and a test set of pattern samples
       was compared to fixed network topologies. Our results confirm that the topology
       optimization makes sense, because the generated networks outperform the fixed
       topologies and reach classification performances near optimum.

4. Relevant Information:

   The problem is to determine whether a patient referred to the clinic is
   hypothyroid. Therefore three classes are built: normal (not hypothyroid),
   hyperfunction and subnormal functioning. Because 92 percent of the patients
   are not hyperthyroid, a good classifier must be significantly better than 92%.

   Note

   These are the attributes Quinlan used in the case study of his article
   "Simplifying Decision Trees" (International Journal of Man-Machine Studies
   (1987) 221-234). Unfortunately, this data differs from the version already
   present (donated by Ross Quinlan). I (Randolf Werner) don't know any more
   details about the dataset, but it is hard to train backpropagation ANNs with
   it. The dataset is used in two technical reports (see above).

5. Number of Instances: ann-train.data: 3772, ann-test.data: 3428

6. Number of Classes: 3

7. Number of Attributes: 21 (15 attributes are binary,
                             6 attributes are continuous)
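The 92% figure in the .names file is a majority-class baseline: a classifier that always predicts the most frequent class already reaches that accuracy. A minimal Python sketch of the check follows. The function names and the example label vector are illustrative assumptions and are not part of this repository; the labels are made up to reproduce the 92% proportion and are not the real data.

```python
from collections import Counter


def majority_baseline(labels):
    """Return the accuracy obtained by always predicting the most frequent class."""
    if not labels:
        raise ValueError("labels must not be empty")
    _, count = Counter(labels).most_common(1)[0]
    return count / len(labels)


def beats_baseline(accuracy, labels, margin=0.0):
    """True if `accuracy` exceeds the majority-class baseline by more than `margin`."""
    return accuracy > majority_baseline(labels) + margin


# Made-up labels: 92 of 100 cases fall in one class (class ids are arbitrary here).
example_labels = [3] * 92 + [1] * 3 + [2] * 5
print(f"baseline accuracy: {majority_baseline(example_labels):.2f}")
print(beats_baseline(0.95, example_labels))
```

On a dataset this imbalanced, plain accuracy says little unless it is compared against this baseline.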
3772
data/tanveer/thyroid/ann-train.data
Executable file
File diff suppressed because it is too large
2
data/tanveer/thyroid/conxuntos.dat
Executable file
File diff suppressed because one or more lines are too long
8
data/tanveer/thyroid/conxuntos_kfold.dat
Executable file
File diff suppressed because one or more lines are too long
22
data/tanveer/thyroid/le_datos.m
Executable file
@@ -0,0 +1,22 @@
printf('reading problem %s ...\n', problema);  % problema is not defined here; assumed to be set by the caller

n_entradas= 21; n_clases= 3;
n_fich= 2; fich{1}= 'ann-train.data'; n_patrons(1)= 3772; fich{2}= 'ann-test.data'; n_patrons(2)= 3428;
n_max= max(n_patrons);
x = zeros(n_fich, n_max, n_entradas); cl= zeros(n_fich, n_max);
n_patrons_total = sum(n_patrons); n_iter=0;

for i_fich=1:n_fich
  f=fopen(fich{i_fich}, 'r');
  if -1==f
    error('fopen error opening %s\n', fich{i_fich});
  end
  for i=1:n_patrons(i_fich)
    fprintf(2,'%5.1f%%\r', 100*n_iter/n_patrons_total); n_iter = n_iter + 1;  % progress on stderr
    for j = 1:n_entradas
      x(i_fich,i,j) = fscanf(f,'%g',1);
    end
    cl(i_fich,i) = fscanf(f,'%i',1) - 1;  % read the class label and store it 0-based
  end
  fclose(f);
end
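For readers who do not use Octave, the reading loop in le_datos.m can be approximated in Python. This is a sketch under stated assumptions, not a port shipped with the repository: `read_patterns` is a hypothetical name, and the data files are assumed to hold whitespace-separated numbers, which is what the `fscanf` calls imply. Like `fscanf`, it consumes a flat stream of tokens and ignores line boundaries, reading `n_inputs` values followed by one class label per pattern and shifting the label to start at 0.

```python
import io


def read_patterns(stream, n_inputs, n_patterns):
    """Read n_patterns records of n_inputs floats plus one integer class label."""
    tokens = stream.read().split()
    width = n_inputs + 1  # attributes plus the class column
    if len(tokens) < n_patterns * width:
        raise ValueError("stream ended before all patterns were read")
    inputs, classes = [], []
    for i in range(n_patterns):
        record = tokens[i * width:(i + 1) * width]
        inputs.append([float(t) for t in record[:n_inputs]])
        classes.append(int(record[n_inputs]) - 1)  # 0-based, as in le_datos.m
    return inputs, classes


# Tiny made-up stream with 2 inputs per pattern; the second record spans two lines.
demo = io.StringIO("0.5 1 3\n0.25 0\n2\n")
demo_inputs, demo_classes = read_patterns(demo, n_inputs=2, n_patterns=2)
print(demo_inputs, demo_classes)
```

For the real files the call would use `n_inputs=21` with 3772 or 3428 patterns, matching the constants in the script.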
6
data/tanveer/thyroid/thyroid.cost
Executable file
@@ -0,0 +1,6 @@
% Rows Columns
3 3
% Matrix elements
0.0 1.0 1.0
1.0 0.0 1.0
1.0 1.0 0.0
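The matrix in thyroid.cost has zeros on the diagonal and ones elsewhere, so every misclassification costs the same and the total cost equals the number of errors. The Python sketch below parses that layout and sums the cost of a set of predictions. The function names are illustrative, the predictions are made up, and the convention that rows index the true class and columns the predicted class is an assumption (it makes no difference for this symmetric matrix).

```python
THYROID_COST = """\
% Rows Columns
3 3
% Matrix elements
0.0 1.0 1.0
1.0 0.0 1.0
1.0 1.0 0.0
"""


def parse_cost_matrix(text):
    """Parse a cost file: '%' lines are comments, then 'rows cols', then the matrix."""
    rows = [line.split() for line in text.splitlines()
            if line.strip() and not line.lstrip().startswith("%")]
    n_rows, n_cols = int(rows[0][0]), int(rows[0][1])
    matrix = [[float(v) for v in row] for row in rows[1:1 + n_rows]]
    if len(matrix) != n_rows or any(len(row) != n_cols for row in matrix):
        raise ValueError("matrix does not match the declared shape")
    return matrix


def total_cost(cost, true_classes, predicted_classes):
    """Sum cost[true][predicted] over all cases (0-based class indices)."""
    return sum(cost[t][p] for t, p in zip(true_classes, predicted_classes))


cost = parse_cost_matrix(THYROID_COST)
# Hypothetical predictions: two of the four cases are wrong.
print(total_cost(cost, [0, 1, 2, 2], [0, 2, 2, 1]))
```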
10
data/tanveer/thyroid/thyroid.txt
Executable file
@@ -0,0 +1,10 @@
n_entradas= 21
n_clases= 3
n_arquivos= 2
fich1= thyroid_train_R.dat
n_patrons1= 3772
fich2= thyroid_test_R.dat
n_patrons2= 3428
n_patrons_entrena= 1886
n_patrons_valida= 1886
n_conxuntos= 1
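thyroid.txt is a flat list of `key= value` pairs. The Python sketch below reads such a file into a dictionary and checks one relationship visible in the numbers: the training and validation pattern counts (1886 + 1886) add up to the 3772 patterns of the first file. `parse_config` is a hypothetical helper, and reading this sum as a half-and-half split of the training file is an interpretation of the values, not something the file states.

```python
THYROID_TXT = """\
n_entradas= 21
n_clases= 3
n_arquivos= 2
fich1= thyroid_train_R.dat
n_patrons1= 3772
fich2= thyroid_test_R.dat
n_patrons2= 3428
n_patrons_entrena= 1886
n_patrons_valida= 1886
n_conxuntos= 1
"""


def parse_config(text):
    """Parse 'key= value' lines; purely numeric values become int, others stay str."""
    config = {}
    for line in text.splitlines():
        if "=" not in line:
            continue  # skip blank or malformed lines
        key, value = line.split("=", 1)
        value = value.strip()
        config[key.strip()] = int(value) if value.isdigit() else value
    return config


config = parse_config(THYROID_TXT)
split_total = config["n_patrons_entrena"] + config["n_patrons_valida"]
print(split_total == config["n_patrons1"])
```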
3452
data/tanveer/thyroid/thyroid_test.arff
Executable file
File diff suppressed because it is too large
3429
data/tanveer/thyroid/thyroid_test_R.dat
Executable file
File diff suppressed because it is too large
3796
data/tanveer/thyroid/thyroid_train.arff
Executable file
File diff suppressed because it is too large
3773
data/tanveer/thyroid/thyroid_train_R.dat
Executable file
File diff suppressed because it is too large