Coverage for /home/martinb/.local/share/virtualenvs/camcops/lib/python3.6/site-packages/statsmodels/datasets/anes96/data.py : 63%

"""American National Election Survey 1996"""
from numpy import log

from statsmodels.datasets import utils as du

__docformat__ = 'restructuredtext'

COPYRIGHT = """This is public domain."""
TITLE = __doc__
SOURCE = """
http://www.electionstudies.org/

The American National Election Studies.
"""

DESCRSHORT = """This data is a subset of the American National Election Studies of 1996."""

DESCRLONG = DESCRSHORT

NOTE = """::

    Number of observations - 944
    Number of variables - 10

    Variable name definitions::

        popul - Census place population in 1000s
        TVnews - Number of times per week that respondent watches TV news.
        PID - Party identification of respondent.
            0 - Strong Democrat
            1 - Weak Democrat
            2 - Independent-Democrat
            3 - Independent-Independent
            4 - Independent-Republican
            5 - Weak Republican
            6 - Strong Republican
        age - Age of respondent.
        educ - Education level of respondent
            1 - 1-8 grades
            2 - Some high school
            3 - High school graduate
            4 - Some college
            5 - College degree
            6 - Master's degree
            7 - PhD
        income - Income of household
            1 - None or less than $2,999
            2 - $3,000-$4,999
            3 - $5,000-$6,999
            4 - $7,000-$8,999
            5 - $9,000-$9,999
            6 - $10,000-$10,999
            7 - $11,000-$11,999
            8 - $12,000-$12,999
            9 - $13,000-$13,999
            10 - $14,000-$14,999
            11 - $15,000-$16,999
            12 - $17,000-$19,999
            13 - $20,000-$21,999
            14 - $22,000-$24,999
            15 - $25,000-$29,999
            16 - $30,000-$34,999
            17 - $35,000-$39,999
            18 - $40,000-$44,999
            19 - $45,000-$49,999
            20 - $50,000-$59,999
            21 - $60,000-$74,999
            22 - $75,000-$89,999
            23 - $90,000-$104,999
            24 - $105,000 and over
        vote - Expected vote
            0 - Clinton
            1 - Dole
        The following 3 variables all take the values:
            1 - Extremely liberal
            2 - Liberal
            3 - Slightly liberal
            4 - Moderate
            5 - Slightly conservative
            6 - Conservative
            7 - Extremely conservative
        selfLR - Respondent's self-reported political leanings from "Left"
            to "Right".
        ClinLR - Respondent's impression of Bill Clinton's political
            leanings from "Left" to "Right".
        DoleLR - Respondent's impression of Bob Dole's political leanings
            from "Left" to "Right".
        logpopul - log(popul + .1)
"""


def load_pandas():
    """Load the anes96 data and return a Dataset class.

    Returns
    -------
    Dataset instance:
        See DATASET_PROPOSAL.txt for more information.
    """
    data = _get_data()
    return du.process_pandas(data, endog_idx=5, exog_idx=[10, 2, 6, 7, 8])
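
The positional arguments passed to `du.process_pandas` are easier to read once they are mapped to column names. The sketch below does that mapping in plain Python. The column order in `ASSUMED_COLUMNS` is an assumption: the source only confirms that `logpopul` is appended last by `_get_data`, which puts it at index 10 if the CSV holds the ten variables listed in NOTE. `select_names` is a hypothetical helper and not part of statsmodels.

```python
# Assumed column order of the frame returned by _get_data().
# Only the position of "logpopul" (appended last) follows from the source;
# the order of the other ten columns is a guess made for illustration.
ASSUMED_COLUMNS = [
    "popul", "TVnews", "selfLR", "ClinLR", "DoleLR",
    "PID", "age", "educ", "income", "vote", "logpopul",
]


def select_names(columns, endog_idx, exog_idx):
    """Translate positional indices into (endog name, list of exog names)."""
    endog = columns[endog_idx]
    exog = [columns[i] for i in exog_idx]
    return endog, exog


# Same indices as in load_pandas().
endog_name, exog_names = select_names(ASSUMED_COLUMNS, 5, [10, 2, 6, 7, 8])
print(endog_name)
print(exog_names)
```

If the assumed order matches the real file, the dependent variable is the party identification column and the regressors are the log population plus four respondent attributes.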


def load(as_pandas=None):
    """Load the anes96 data and return a Dataset class.

    Parameters
    ----------
    as_pandas : bool
        Flag indicating whether to return pandas DataFrames and Series
        or numpy recarrays and arrays. If True, returns pandas.

    Returns
    -------
    Dataset instance:
        See DATASET_PROPOSAL.txt for more information.
    """
    return du.as_numpy_dataset(load_pandas(), as_pandas=as_pandas)
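

def _get_data():
    # Read the whitespace-separated CSV that ships next to this module.
    data = du.load_csv(__file__, 'anes96.csv', sep=r'\s')
    data = du.strip_column_names(data)
    # Derived regressor documented in NOTE: log(popul + .1).
    data['logpopul'] = log(data['popul'] + .1)
    return data.astype(float)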
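
The derived column can be illustrated without the packaged CSV. The following is a minimal sketch of the `logpopul` transformation using only the standard library. The sample rows are made up, and `add_logpopul` is a hypothetical helper. Unlike `_get_data`, it casts to float before taking the log, which gives the same result for numeric input.

```python
import math


def add_logpopul(rows):
    """Return copies of the rows with float values and a 'logpopul' entry."""
    out = []
    for row in rows:
        # Mirror data.astype(float): every value becomes a float.
        new_row = {key: float(value) for key, value in row.items()}
        # Same formula as in NOTE: log(popul + .1).
        new_row["logpopul"] = math.log(new_row["popul"] + 0.1)
        out.append(new_row)
    return out


# Made-up rows for illustration; not taken from anes96.csv.
sample = [{"popul": 0, "age": 30}, {"popul": 190, "age": 45}]
result = add_logpopul(sample)
for row in result:
    print(row)
```

The 0.1 offset keeps the logarithm finite for a place with `popul` equal to 0, where `math.log(0)` would raise a `ValueError`.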