Carry out a difference in means test using a small number of actual observations from the 2015 American Community Survey.
The first main task is, essentially, to create a question in the spirit of the one from the Week 7 Assignment, but using different variables that you will select from the ACS. In particular, you will select an outcome variable (in my question, this was female work hours) and then a variable that separates the sample into two groups (in my question, this was women who have either 1 or 2 children). You’ll select six observations at random from the 2015 ACS from PUMA 0608511, three from each group, and arrange the data in a table as in the Week 7 assignment. Finally, you’ll carry out a difference in means test using these six observations, and say whether there is a statistically significant difference in the outcome variable across the two groups.
The second main task is to review an econometric literature that relates to your question. You will search for scholarly, academic articles that have been published in peer-reviewed journals using EconLit, JSTOR and Google Scholar. You will identify three articles and provide a one-sentence commentary on each one, as well as an introductory and a concluding sentence, to make a concise one-paragraph literature review. (This is only a mini-term paper, after all!)
In all, the term paper will have five sections:
Introduction
Literature Review and Related Economic Theory
Data
Analysis
Conclusion
You will also have a References section, with three references, using appropriate bibliographic citation format.[1] So technically there are six sections, but the five above will contain writing, including a table and some equations.
In the introduction, you state the research question. The research question may also be the title of your paper, but the intro elaborates on this question and provides some motivation for it; tell the reader why it is important and worthy of their time. Shoot for one five-sentence paragraph.
The literature review section should also be one paragraph, as mentioned above, and should discuss three related studies. Search: 1.) the JSTOR database (visit http://library.calstate.edu/sanjose/databases/alphabetical?alpha=J to find the link to JSTOR; once there, limit your search to Econ journals under Advanced Search, and search using keywords in your area of interest); 2.) the EconLit database (to get there, change the end of the link above from “J” to “E”). You can search EconLit by keyword; try also to search by subject (SU) and JEL code (https://www.aeaweb.org/econlit/jelCodes.php); 3.) Google Scholar. Often there is one “seminal” article that most researchers who are studying topics similar to yours cite. If this is true for the area you are studying (and even if it isn’t), go to Google Scholar, search for a study, and click the link next to the “ button. This is a convenient way of doing a cited reference search, and as already mentioned the “ button is a convenient way to get perfectly formatted bibliographic citations. (Note: When you find articles on Google or through other web search, you will often need to access the university’s subscriptions. Our library subscribes to most scholarly journals.[2] Articles from the JSTOR and EconLit databases are safe choices to qualify as peer-reviewed academic or scholarly articles. If you know a journal you want to search, go to http://library.sjsu.edu/ and click Journal Titles. Enter the journal title. You can also use Google Scholar to link to articles in the SJSU library without having to enter your Student ID and password; on Google Scholar, set up “library links” under “settings”.)
The Data section will briefly describe the American Community Survey. (I know what this data set is, but I want to read that you, the author, can describe it at least reasonably well.) Then describe the variables you are using for analysis. You should provide a description of each variable you use, arranged in a Table of Variable Descriptions. You can then present the six observations you select for analysis in another table.
The analysis section describes the calculations for the difference in means test for the six observations, calculates the test statistic, and rejects or does not reject the null hypothesis of no difference in means. Show the equations involved in calculating the test statistic, and then present the results in a table like Table 1.1 from Mastering Metrics (yours will be a lot smaller, as you only have one outcome variable). If the paper also analyzes a larger sample than the minimum required six observations, the results of that analysis can also be presented here.
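As a reference for the equations this section asks for, here is one common way to write the test. It is a sketch under stated assumptions, not necessarily the exact notation of the textbook: the two groups are labeled 1 and 0, the sample variances use the n − 1 divisor, and the standard error uses separate variances for each group. Check the notation against Mastering Metrics before using it in your paper.

```latex
\bar{Y}_g = \frac{1}{n_g}\sum_{i=1}^{n_g} Y_{ig}, \qquad
S_g^2 = \frac{1}{n_g - 1}\sum_{i=1}^{n_g}\left(Y_{ig} - \bar{Y}_g\right)^2, \qquad g \in \{0, 1\}

\widehat{SE}\left(\bar{Y}_1 - \bar{Y}_0\right) = \sqrt{\frac{S_1^2}{n_1} + \frac{S_0^2}{n_0}}

t = \frac{\bar{Y}_1 - \bar{Y}_0}{\widehat{SE}\left(\bar{Y}_1 - \bar{Y}_0\right)}
```

The null hypothesis of no difference in means is rejected at the 5% level when the absolute value of t exceeds 1.96.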
Finally, the Conclusion briefly summarizes the results and indicates whether the findings were consistent with previous studies. You should try to comment here on whether the correlations you study in this paper should be interpreted as any sort of causal effect, and perhaps how one could extend this analysis in the future.
Having examined the ACS_2015_Holian_Workbook.xlsx file from which the data used in the Week 7 Assignment were taken (this is the “Small” Excel file), you will have seen how I gathered the six observations used in that difference in means test. Study that Week 7 problem, as the largest part of your term paper assignment is basically to make up your own problem along the exact same lines.
An analysis of six observations is the minimum standard for earning a passing grade on the term paper. Students should also carry out the same test on a larger (potentially much larger) sample. Ideally, you can carry out the test using the full sample of PUMA 0608511 using R Studio. It would be sufficient for this assignment to report the values of the two variables (the “outcome” or dependent variable and the binary group-selector variable) for six observations and then carry out the test, but earning a passing grade on the paper would then also require a good literature review section. If, on the other hand, you attempt to analyze a sample larger than six, I will be more forgiving if the literature review isn’t very well done. In other words, you can earn a sort of “extra credit” for attempting larger samples. I don’t want to overwhelm students who are struggling with R Studio, hence the requirement of only six observations, but ultimately there’s a lot of “big data” in the world, and it would be good if you can figure out how to use a computer to analyze it.
In fact, there are millions of observations in the ACS that you can access relatively easily. Create an account at https://usa.ipums.org/usa/ and download a CSV file. You can download data for the whole US, for multiple years, and just the variables you need. Email me and I can help you if you want to go this ambitious but totally doable route.
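To give a sense of what the larger-sample route could look like, here is a minimal sketch in Python (the assignment itself points to R Studio, and the same steps apply there). The column names `NCHILD` and `UHRSWORK` and the file name in the comment are assumptions for illustration; your extract may use different names. The in-memory CSV simply holds the six Week 7 observations as a stand-in for a downloaded file, and the standard error uses separate sample variances with the n − 1 divisor, which is also an assumption.

```python
import csv
import io
import math
import statistics


def diff_in_means(rows, outcome, group, group_a, group_b):
    """Compare mean `outcome` between rows where `group` is group_a vs. group_b."""
    a = [float(r[outcome]) for r in rows if r[group] == group_a]
    b = [float(r[outcome]) for r in rows if r[group] == group_b]
    diff = statistics.mean(a) - statistics.mean(b)
    # Separate-variances standard error; statistics.variance divides by n - 1.
    se = math.sqrt(statistics.variance(a) / len(a) + statistics.variance(b) / len(b))
    return diff, se, diff / se


# Stand-in for a downloaded extract: the six Week 7 observations stored under
# assumed column names. With a real file, replace this with something like
# open("your_extract.csv", newline="") (hypothetical file name).
sample_csv = io.StringIO(
    "NCHILD,UHRSWORK\n"
    "2,0\n2,45\n2,0\n1,35\n1,40\n1,0\n"
)
rows = list(csv.DictReader(sample_csv))

# One-child group minus two-child group.
diff, se, t_stat = diff_in_means(rows, "UHRSWORK", "NCHILD", "1", "2")
print(f"difference = {diff:.2f}, SE = {se:.2f}, t = {t_stat:.2f}")
```

A real extract would first need to be filtered to the population of interest (for example, mothers in households with one or two children) before the two groups are compared.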
You will get points for producing and answering a question of the form described above, using six actual observations from the ACS. You will also get points for producing a question that relates to an interesting causal question from economics, as well as effectively reviewing relevant papers. You don’t have to discuss the previously published papers in detail, but instead must just say what they did and what they found in general terms, and how it relates to what you’re doing.
Finally, please upload your submission as either a PDF or DOC or DOCX file.
This assignment asks you to carry out a difference in means test using a small number of actual observations from the 2015 American Community Survey.
You should take the opportunity of completing this assignment to become familiar with the ACS data, as you will be using them in your term paper. The ACS is conducted every year by the U.S. Census Bureau. It has asked the same basic questions since its inception in the early 2000s, which themselves are a continuation of the same basic questions asked on the decennial Census going back decades and even centuries. Thus it is arguably the most important of all the Census data products. There are two Excel files in the ACS folder on Canvas. The larger of the two contains data from individuals and households in a small area of Santa Clara County, called “San Jose City (South Central/Branham) & Cambrian Park PUMA, California”. This area is also known as “Public Use Microdata Area 0608511”, as it’s the 11th PUMA (11) in Santa Clara County (085) in the state of California (06). The numbers here refer to the Federal Information Processing Standards codes for Census geographies (these are called “FIPS” codes). In 2015, the Census surveyed 403 households in PUMA 0608511, in which 1,042 individuals resided. The larger Excel file contains all the data from these individuals and households, as well as aggregate data (you can read more detail in the ReadMe worksheet of the larger Excel file).
I urge you to try to understand the layout of the files and, in general, to become familiar with these data, as you’ll be using them in your term papers. When you’re ready to start thinking of variables you can use in your original term paper, I’ll point you to a third file, the Data Dictionary. Together, three files (1.) the Data Dictionary, 2.) the Questionnaire Form, and of course 3.) the actual ACS data from PUMA 0608511) comprise all the pieces of information a student should need to gain an understanding of the data and formulate an original question to try to answer with the data in their term papers.
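The way the code 0608511 breaks down can be sketched in a few lines of Python. The function name and dictionary keys are hypothetical, and the layout simply follows the description above for this particular PUMA; it is not meant as a general rule for every PUMA code. The code is kept as a string so the leading zero of the state code is not lost.

```python
def split_puma_code(code):
    """Split a 7-digit code such as "0608511" into the parts described in the text."""
    if len(code) != 7 or not code.isdigit():
        raise ValueError("expected a 7-digit string")
    return {
        "state": code[:2],           # "06" is California
        "county": code[2:5],         # "085" is Santa Clara County
        "puma_in_county": code[5:],  # "11" is the 11th PUMA in the county
    }


parts = split_puma_code("0608511")
print(parts)
```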
As you will see from examining the small Excel file, I slightly modified only one value to make the problem below easier to calculate by hand. You should be able to complete this assignment with a pen and paper only. You can check your work with a computer or calculator, but you will learn the most if you try it by hand.
The six households below contain one mom, one dad, and either one or two children. The table below indicates the Census-assigned household number (for reference purposes; it is not used in this assignment), the number of children in the household, a two-child household indicator variable, and the number of hours the mom works in a typical week.
Household | Number of kids | Two kids (1 = yes) | Mom’s weekly work hours |
16761 | 2 | 1 | 0 |
24438 | 2 | 1 | 45 |
41729 | 2 | 1 | 0 |
42802 | 1 | 0 | 35 |
63528 | 1 | 0 | 40 |
148848 | 1 | 0 | 0 |
Here is your task: Carry out a difference in means test of the null hypothesis that moms with one child work the same number of hours as moms with two kids; in other words, the null hypothesis is that average work hours are the same in both types of households. To do this, you’ll have to answer the following questions:
What are the average female work hours among one-child households? What are the average female work hours among two-child households? What is the difference in means?
What is the estimated standard error of the difference in means? (Hint: you have to calculate the variance of work hours for each group and plug these values, along with the number of observations in each group, into the formula in MM Chapter 1, footnote 17. You then have to take the square root of a large number; if you are doing this by hand, an approximate value is fine, and you can use the following hint to help you: the square root of 361 is 19 and the square root of 400 is 20.)
What is the value of the test statistic?
Is the difference in means statistically different from zero? In other words, do you reject the null hypothesis of equal work hours? (Hint: if the absolute value of the test statistic is greater than 1.96 we reject the null and say the difference in means is statistically different from zero, or “statistically significant”. Angrist and Pischke describe the critical value as “about two” but the number 1.96 is the precise value at which the cumulative standard normal distribution (this is a “bell shaped curve”) has 2.5% of the area under the curve and to the right. Thus, 95% of the area is under the curve between -1.96 and 1.96, and so 1.96 is known as the critical value for a test at the 5% significance level.)
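Once you have tried the problem by hand, the steps above can be checked with a short script. This is a sketch in Python under two assumptions that are not spelled out in the problem: the group variances use the n − 1 divisor, and the standard error is the square root of the sum of each group's variance divided by its number of observations. The variable names are illustrative.

```python
import math
from statistics import NormalDist, mean, variance

one_kid_hours = [35, 40, 0]  # households 42802, 63528, 148848
two_kid_hours = [0, 45, 0]   # households 16761, 24438, 41729

# Step 1: group means and their difference (one-child minus two-child).
mean_one = mean(one_kid_hours)
mean_two = mean(two_kid_hours)
diff = mean_one - mean_two

# Step 2: sample variances (n - 1 divisor, an assumption) and standard error.
var_one = variance(one_kid_hours)
var_two = variance(two_kid_hours)
se = math.sqrt(var_one / len(one_kid_hours) + var_two / len(two_kid_hours))

# Step 3: test statistic.
t_stat = diff / se

# Step 4: compare with the two-sided 5% critical value, the 97.5th
# percentile of the standard normal distribution (about 1.96).
critical = NormalDist().inv_cdf(0.975)
reject = abs(t_stat) > critical

print("means:", mean_one, mean_two, "difference:", diff)
print("variances:", var_one, var_two)
print("SE:", round(se, 2), "t:", round(t_stat, 2), "reject null:", reject)
```

Under these assumptions the number under the square root is about 383, which falls between 361 and 400 as the hint suggests. The difference in means is 10 hours, the standard error is about 19.6, and the test statistic is about 0.51, so the null hypothesis of equal work hours is not rejected at the 5% level.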