mirror of
https://github.com/Doctorado-ML/Stree_datasets.git
synced 2025-08-19 01:16:01 +00:00
Initial commit
61
data/tanveer/lung-cancer/lung-cancer.names
Executable file
@@ -0,0 +1,61 @@
1. Title: Lung Cancer Data

2. Source Information:
    - Data was published in:
      Hong, Z.Q. and Yang, J.Y. "Optimal Discriminant Plane for a Small
      Number of Samples and Design Method of Classifier on the Plane",
      Pattern Recognition, Vol. 24, No. 4, pp. 317-324, 1991.
    - Donor: Stefan Aeberhard, stefan@coral.cs.jcu.edu.au
    - Date: May, 1992

3. Past Usage:
    - Hong, Z.Q. and Yang, J.Y. "Optimal Discriminant Plane for a Small
      Number of Samples and Design Method of Classifier on the Plane",
      Pattern Recognition, Vol. 24, No. 4, pp. 317-324, 1991.
    - Aeberhard, S., Coomans, D., De Vel, O. "Comparisons of
      Classification Methods in High Dimensional Settings",
      submitted to Technometrics.
    - Aeberhard, S., Coomans, D., De Vel, O. "The Dangers of
      Bias in High Dimensional Settings", submitted to
      Pattern Recognition.

4. Relevant Information:
    - This data was used by Hong and Yang to illustrate the
      power of the optimal discriminant plane even in ill-posed
      settings. Applying the KNN method in the resulting plane
      gave 77% accuracy. However, these results are strongly
      biased (see Aeberhard's second reference above, or email
      stefan@coral.cs.jcu.edu.au). Results obtained by
      Aeberhard et al. are:

      RDA: 62.5%, KNN: 53.1%, Opt. Disc. Plane: 59.4%

      The data describes 3 types of pathological lung cancers.
      The authors give no information on the individual
      variables nor on where the data was originally used.

    - In the original data 4 values for the fifth attribute were -1.
      These values have been changed to ? (unknown). (*)
    - In the original data 1 value for the 39th attribute was 4. This
      value has been changed to ? (unknown). (*)

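The "?" substitution described above means a reader of the data has to treat "?" as a missing-value marker and not as a number. A minimal sketch of such a reader follows. It assumes the data file is comma-separated with one instance per line, which this description does not state. The function name `parse_rows` and the sample rows are made up for illustration and are not taken from the real data.

```python
import csv
import io

MISSING = "?"  # marker that replaced the original -1 and 4 values


def parse_rows(text):
    """Parse comma-separated rows; '?' becomes None, other fields become int."""
    rows = []
    for record in csv.reader(io.StringIO(text)):
        if not record:
            continue  # skip blank lines
        rows.append(
            [None if field.strip() == MISSING else int(field) for field in record]
        )
    return rows


# Two shortened, invented rows (the real rows have 57 values each).
sample = "1,0,3,0,?,0\n2,1,2,1,1,3\n"
rows = parse_rows(sample)
```

Mapping "?" to `None` keeps the missing entries distinct from the legitimate values 0-3, so later code cannot mistake them for a category.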
5. Number of Instances: 32

6. Number of Attributes: 57 (1 class attribute, 56 predictive)

7. Attribute Information:
    - Attribute 1 is the class label.
    - All predictive attributes are nominal, taking on integer
      values 0-3.

8. Missing Attribute Values: Attributes 5 and 39 (*)

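Items 6 to 8 amount to a small schema that can be checked row by row. The sketch below is one way to do that, under several assumptions: rows are already parsed into lists with `None` for "?", attribute numbers are 1-based and count the class label as attribute 1, and the class labels are the integers 1, 2 and 3 as suggested by item 9. The name `check_row` is hypothetical.

```python
N_ATTRIBUTES = 57               # 1 class attribute + 56 predictive (item 6)
CLASS_LABELS = {1, 2, 3}        # assumed from the class list in item 9
PREDICTIVE_VALUES = {0, 1, 2, 3}  # nominal integer values 0-3 (item 7)
MISSING_ALLOWED = {5, 39}       # 1-based attribute numbers (item 8)


def check_row(row):
    """Return a list of problems found in one parsed row (None = missing)."""
    if len(row) != N_ATTRIBUTES:
        return [f"expected {N_ATTRIBUTES} values, got {len(row)}"]
    problems = []
    if row[0] not in CLASS_LABELS:
        problems.append(f"attribute 1: unexpected class label {row[0]!r}")
    # Predictive attributes are numbered 2..57 in 1-based terms.
    for number, value in enumerate(row[1:], start=2):
        if value is None:
            if number not in MISSING_ALLOWED:
                problems.append(f"attribute {number}: unexpected missing value")
        elif value not in PREDICTIVE_VALUES:
            problems.append(f"attribute {number}: value {value!r} outside 0-3")
    return problems
```

An empty list means the row matches the description; anything else points at the attribute number that disagrees with it.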
9. Class Distribution:
    - 3 classes:
        1.)  9 observations
        2.) 13 observations
        3.) 10 observations
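The class counts give a useful reference point for the accuracies quoted in item 4. The sketch below computes the accuracy of always predicting the largest class, and the number of correct predictions each reported accuracy would imply. The second step assumes each accuracy was measured over all 32 instances, which this description does not state; the variable names are illustrative only.

```python
class_counts = {1: 9, 2: 13, 3: 10}       # from item 9
n_instances = sum(class_counts.values())  # 32, matching item 5

# Accuracy of always predicting the most frequent class: 13/32 = 0.40625
majority_baseline = max(class_counts.values()) / n_instances

# Accuracies reported by Aeberhard et al. (item 4).
reported = {"RDA": 0.625, "KNN": 0.531, "Opt. Disc. Plane": 0.594}

# Correct predictions implied if each figure covers all 32 instances (assumption).
implied_correct = {
    name: round(accuracy * n_instances) for name, accuracy in reported.items()
}
```

Under that assumption the reported figures correspond to 20, 17 and 19 correct predictions out of 32, against 13 for the majority-class baseline.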