mirror of
https://github.com/Doctorado-ML/Stree_datasets.git
synced 2025-08-18 08:56:01 +00:00
Initial commit
BIN
data/tanveer/titanic/datos_orixinais/titanic.tar.gz
Executable file
Binary file not shown.
2201
data/tanveer/titanic/datos_orixinais/titanic/Dataset.data
Executable file
File diff suppressed because it is too large
BIN
data/tanveer/titanic/datos_orixinais/titanic/Dataset.data.gz
Executable file
Binary file not shown.
57
data/tanveer/titanic/datos_orixinais/titanic/Source/Notes
Executable file
@@ -0,0 +1,57 @@
TITANIC DATASET

Converted for use in DELVE by Radford Neal, June 1996.
Originally compiled by Robert Dawson, 1995.


The titanic dataset gives the values of four categorical attributes
for each of the 2201 people on board the Titanic when it struck an
iceberg and sank. The attributes are social class (first class,
second class, third class, crewmember), age (adult or child), sex, and
whether or not the person survived.

The question of interest for this natural dataset is how survival
relates to the other attributes. There is obviously no practical
need to predict survival, so the real interest is in interpretation,
but success at prediction would appear to be closely related to
the discovery of interesting features of the relationship. Note
that there are only sixteen possible combinations of input attributes
for this prediction task, so the interesting behaviour will be that
with small training sets.


Source from which the data was obtained.

The original source files are titanic.doc and titanic.dat, which were
obtained from the data archive of the on-line Journal of Statistics
Education, whose home page on the Web is at URL

http://www2.ncsu.edu/ncsu/pams/stat/info/jse/homepage.html

Carriage returns at the end of the lines were deleted, as was a line
containing a period at the end of each file. Other than this, the
titanic.doc and titanic.dat files are as obtained from this source.

The dataset was compiled by Robert J. MacG. Dawson, and discussed by
him in the on-line article 'The "Unusual Episode" Data Revisited',
Journal of Statistics Education, vol. 3, no. 3 (1995), available via
the URL above.


Notes on aspects of the data.

As discussed in the article, the dataset was reconstructed from
sources that were not completely clear, so there are undoubtably some
errors.

The cases in titanic.dat are clearly in a non-informative order,
grouped by identical attribute patterns. This has been retained for
the DELVE dataset file.

The representation of attributes has been changed to be more mnemonic.

Prior information regarding the significance of social class is
somewhat debatable. In the standard prior, I have considered status
to be an ordinal variable in which crewmembers come after third class
passengers. Perhaps crewmembers should be considered to be outside
this class ordering altogether, but that is not convenient.
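
The count of sixteen input combinations mentioned in the Notes follows from the attribute sizes: 4 classes x 2 ages x 2 sexes. The sketch below enumerates them. The label strings are assumptions chosen for illustration; the Notes say the representation was made "more mnemonic" but do not show the actual values used in the DELVE file.

```python
from itertools import product

# Hypothetical labels; the real DELVE encoding is not shown in this commit view.
CLASSES = ["first", "second", "third", "crew"]
AGES = ["adult", "child"]
SEXES = ["male", "female"]

# Every possible (class, age, sex) input pattern for the survival task.
combinations = list(product(CLASSES, AGES, SEXES))

for combo in combinations:
    print(combo)
print(len(combinations), "possible input patterns")
```

Because the input space is this small, a model can in principle memorise one survival estimate per pattern, which is why the Notes single out small training sets as the interesting regime.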
2201
data/tanveer/titanic/datos_orixinais/titanic/Source/titanic.dat
Executable file
File diff suppressed because it is too large
68
data/tanveer/titanic/datos_orixinais/titanic/Source/titanic.doc
Executable file
@@ -0,0 +1,68 @@
NAME: Population at Risk and Death Rates for an Unusual Episode
TYPE: Complete record for all of population at risk
SIZE: 2201 observations, 4 variables

DESCRIPTIVE ABSTRACT:
For each person on board the fatal maiden voyage of the ocean liner
Titanic, this dataset records sex, age [adult/child], economic status
[first/second/third class, or crew] and whether or not that person
survived.

SOURCE:
"Report on the Loss of the `Titanic' (S.S.)" (1990), _British Board of
Trade Inquiry Report_ (reprint), Gloucester, UK: Allan Sutton
Publishing.

VARIABLE DESCRIPTIONS:
Column
 1   Class    (0 = crew, 1 = first, 2 = second, 3 = third)
10   Age      (1 = adult, 0 = child)
19   Sex      (1 = male, 0 = female)
28   Survived (1 = yes, 0 = no)

Values are aligned and delimited by blanks. There are no missing
values.

SPECIAL NOTES:
There is not complete agreement among primary sources as to the exact
numbers on board, rescued, or lost.

STORY BEHIND THE DATA:
The sinking of the Titanic is a famous event, and new books are still
being published about it. Many well-known facts--from the proportions
of first-class passengers to the "women and children first" policy, and
the fact that that policy was not entirely successful in saving the
women and children in the third class--are reflected in the survival
rates for various classes of passenger. These data were originally
collected by the British Board of Trade in their investigation of the
sinking.

PEDAGOGICAL NOTES:
These data make an interesting exercise if given to a class without
their context, which the students must attempt to discover. The
instructor will probably want to answer questions from the class,
"Twenty Questions" style.

There is a similar set of data circulating without any detailed
explanation or compiler's name attached, under the same title, which
omits the crew (and does not agree with any of the primary sources that
I was able to find.) Credit for the original idea goes to the
originator of that exercise: my version is merely an attempt to
provide a more complete context.

Additional information about these data can be found in the "Datasets
and Stories" article "The `Unusual Episode' Data Revisited" in the
_Journal of Statistics Education_ (Dawson 1995). Send the message

send jse/v3n3/datasets.dawson

to the address archive@jse.stat.ncsu.edu

SUBMITTED BY:
Robert J. MacG. Dawson
Department of Mathematics and Computing Science
Saint Mary's University
Halifax, Nova Scotia B3H 3C3
CANADA
rdawson@husky1.stmarys.ca
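
The variable descriptions in titanic.doc can be turned into a small decoder for rows of titanic.dat. This is a minimal sketch, not code from the repository: the function name `parse_line` and the dictionary keys are hypothetical, and the sample row is made up to show the layout, since the file's contents are suppressed in this view. Because the documentation says values are delimited by blanks, the sketch splits on whitespace instead of relying on the exact column offsets (1, 10, 19, 28).

```python
# Code-to-label mappings copied from VARIABLE DESCRIPTIONS in titanic.doc.
CLASS_LABELS = {0: "crew", 1: "first", 2: "second", 3: "third"}
AGE_LABELS = {1: "adult", 0: "child"}
SEX_LABELS = {1: "male", 0: "female"}
SURVIVED_LABELS = {1: "yes", 0: "no"}


def parse_line(line):
    """Decode one blank-delimited row into a dict of readable labels."""
    fields = line.split()
    if len(fields) != 4:
        raise ValueError(f"expected 4 fields, got {len(fields)}: {line!r}")
    cls, age, sex, survived = (int(field) for field in fields)
    return {
        "class": CLASS_LABELS[cls],
        "age": AGE_LABELS[age],
        "sex": SEX_LABELS[sex],
        "survived": SURVIVED_LABELS[survived],
    }


# Illustrative row only (not copied from titanic.dat).
sample = "1        1        1        1"
print(parse_line(sample))
```

An unknown code raises `KeyError`, which is a reasonable default here given that the documentation states there are no missing values.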
6
data/tanveer/titanic/datos_orixinais/titanic/Summary
Executable file
@@ -0,0 +1,6 @@
The titanic dataset gives the values of four categorical attributes
for each of the 2201 people on board the Titanic when it struck an
iceberg and sank. The attributes are social class (first class,
second class, third class, or crewmember), age (adult or child), sex,
and whether or not the person survived. The question of interest is
considered to be how survival relates to the other attributes.
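
One simple way to look at how survival relates to another attribute is to compute the survival rate within each group. The sketch below assumes records have already been decoded into dictionaries; the helper name `survival_rate_by` and the record keys are hypothetical, and `toy_records` is a made-up placeholder, not data from this dataset.

```python
from collections import defaultdict


def survival_rate_by(records, key):
    """Return {group: fraction survived} for the attribute named by key."""
    totals = defaultdict(int)
    survivors = defaultdict(int)
    for record in records:
        group = record[key]
        totals[group] += 1
        if record["survived"] == "yes":
            survivors[group] += 1
    return {group: survivors[group] / totals[group] for group in totals}


# Placeholder rows for illustration only; they say nothing about the real data.
toy_records = [
    {"class": "first", "sex": "female", "survived": "yes"},
    {"class": "first", "sex": "male", "survived": "no"},
    {"class": "third", "sex": "male", "survived": "no"},
    {"class": "crew", "sex": "male", "survived": "yes"},
]

rates = survival_rate_by(toy_records, "class")
print(rates)
```

Running the same helper with `key="sex"` or `key="age"` on the real records would give the other marginal rates.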
BIN
data/tanveer/titanic/datos_orixinais/titanic/survived/Prototask.data.gz
Executable file
Binary file not shown.
2201
data/tanveer/titanic/datos_orixinais/titanic/survived/Random-order
Executable file
File diff suppressed because it is too large
4
data/tanveer/titanic/datos_orixinais/titanic/survived/std.prior
Executable file
@@ -0,0 +1,4 @@
1 NLMH ordinal
2 NLMH binary
3 NLMH binary
4 NLMH binary passive=no