mirror of
https://github.com/Doctorado-ML/Stree_datasets.git
synced 2025-08-18 08:56:01 +00:00
Commit Inicial
This commit is contained in:
93
data/tanveer/optical/optdigits.names
Executable file
93
data/tanveer/optical/optdigits.names
Executable file
@@ -0,0 +1,93 @@
|
||||
|
||||
1. Title of Database: Optical Recognition of Handwritten Digits
|
||||
|
||||
2. Source:
|
||||
E. Alpaydin, C. Kaynak
|
||||
Department of Computer Engineering
|
||||
Bogazici University, 80815 Istanbul Turkey
|
||||
alpaydin@boun.edu.tr
|
||||
July 1998
|
||||
|
||||
3. Past Usage:
|
||||
C. Kaynak (1995) Methods of Combining Multiple Classifiers and Their
|
||||
Applications to Handwritten Digit Recognition,
|
||||
MSc Thesis, Institute of Graduate Studies in Science and
|
||||
Engineering, Bogazici University.
|
||||
|
||||
E. Alpaydin, C. Kaynak (1998) Cascading Classifiers, Kybernetika,
|
||||
to appear. ftp://ftp.icsi.berkeley.edu/pub/ai/ethem/kyb.ps.Z
|
||||
|
||||
4. Relevant Information:
|
||||
We used preprocessing programs made available by NIST to extract
|
||||
normalized bitmaps of handwritten digits from a preprinted form. From
|
||||
a total of 43 people, 30 contributed to the training set and different
|
||||
13 to the test set. 32x32 bitmaps are divided into nonoverlapping
|
||||
blocks of 4x4 and the number of on pixels are counted in each block.
|
||||
This generates an input matrix of 8x8 where each element is an
|
||||
integer in the range 0..16. This reduces dimensionality and gives
|
||||
invariance to small distortions.
|
||||
|
||||
For info on NIST preprocessing routines, see
|
||||
M. D. Garris, J. L. Blue, G. T. Candela, D. L. Dimmick, J. Geist,
|
||||
P. J. Grother, S. A. Janet, and C. L. Wilson, NIST Form-Based
|
||||
Handprint Recognition System, NISTIR 5469, 1994.
|
||||
|
||||
5. Number of Instances
|
||||
optdigits.tra Training 3823
|
||||
optdigits.tes Testing 1797
|
||||
|
||||
The way we used the dataset was to use half of training for
|
||||
actual training, one-fourth for validation and one-fourth
|
||||
for writer-dependent testing. The test set was used for
|
||||
writer-independent testing and is the actual quality measure.
|
||||
|
||||
6. Number of Attributes
|
||||
64 input+1 class attribute
|
||||
|
||||
7. For Each Attribute:
|
||||
All input attributes are integers in the range 0..16.
|
||||
The last attribute is the class code 0..9
|
||||
|
||||
8. Missing Attribute Values
|
||||
None
|
||||
|
||||
9. Class Distribution
|
||||
Class: No of examples in training set
|
||||
0: 376
|
||||
1: 389
|
||||
2: 380
|
||||
3: 389
|
||||
4: 387
|
||||
5: 376
|
||||
6: 377
|
||||
7: 387
|
||||
8: 380
|
||||
9: 382
|
||||
|
||||
Class: No of examples in testing set
|
||||
0: 178
|
||||
1: 182
|
||||
2: 177
|
||||
3: 183
|
||||
4: 181
|
||||
5: 182
|
||||
6: 181
|
||||
7: 179
|
||||
8: 174
|
||||
9: 180
|
||||
|
||||
Accuracy on the testing set with k-nn
|
||||
using Euclidean distance as the metric
|
||||
|
||||
k = 1 : 98.00
|
||||
k = 2 : 97.38
|
||||
k = 3 : 97.83
|
||||
k = 4 : 97.61
|
||||
k = 5 : 97.89
|
||||
k = 6 : 97.77
|
||||
k = 7 : 97.66
|
||||
k = 8 : 97.66
|
||||
k = 9 : 97.72
|
||||
k = 10 : 97.55
|
||||
k = 11 : 97.89
|
||||
|
Reference in New Issue
Block a user