{"id":431,"date":"2018-10-15T19:14:03","date_gmt":"2018-10-15T19:14:03","guid":{"rendered":"http:\/\/courses.shroutdocs.org\/dcs104-fall2018\/?p=431"},"modified":"2018-11-11T18:44:08","modified_gmt":"2018-11-11T18:44:08","slug":"practice-exercise-3-quantification","status":"publish","type":"post","link":"https:\/\/courses.shroutdocs.org\/dcs104-fall2018\/2018\/10\/15\/practice-exercise-3-quantification\/","title":{"rendered":"Practice Exercise #3 &#8211; Quantification"},"content":{"rendered":"<p>This exercise brings together what we have learned about quantification so far.\u00a0 You will be creating your own notebook from scratch, calling data, manipulating that data, and analyzing it.\u00a0 Please submit the notebook via this Lyceum link (<a href=\"https:\/\/lyceum.bates.edu\/mod\/assign\/view.php?id=215520\">section A<\/a>) (<a href=\"https:\/\/lyceum.bates.edu\/mod\/assign\/view.php?id=215521\">section B<\/a>) by the start of class on October 25th.<\/p>\n<p><strong>Note: Since you all are cleaning the data, you will not be able to start this lab until 5 PM on Tuesday the 16th.<\/strong><\/p>\n<p><strong>Steps:<\/strong><\/p>\n<ol>\n<li><strong>Set up<\/strong>\n<ol>\n<li>Create a new notebook. Name it DSC_104_PE3 (do not include your name &#8211; I will grade anonymously and then return your notebooks)<\/li>\n<li>Title your notebook. Make your first cell into a markdown cell (hit escape and then m) and then enter one hashtag &#8211; this makes the cell into a title cell.\u00a0 (For more on markdown syntax, <a href=\"https:\/\/github.com\/adam-p\/markdown-here\/wiki\/Markdown-Cheatsheet\">look here<\/a>) Write a placeholder title (you&#8217;ll come back at the end and change the title). Run the cell.<\/li>\n<li>Title your sections. 
Repeat the steps above for six new cells but use two hashtags (this creates smaller title text).\u00a0 Name them: <strong>Introduction, Calling Data, Exploratory Statistics, Correlation, Categorical Variables, <\/strong>and<strong> Regression<\/strong>.<\/li>\n<li>Look over the <a href=\"http:\/\/courses.shroutdocs.org\/dcs104-fall2018\/2018\/10\/15\/oral-history-data-readme\/\">README I created for our data<\/a>.\u00a0 It describes each table, gives the URL that holds that table, and describes each column in the table.<\/li>\n<\/ol>\n<\/li>\n<li><strong>Calling Data<\/strong>\n<ol>\n<li>Use the function\u00a0<strong>library(&quot;psych&quot;)\u00a0<\/strong>to load the psych library of functions.<\/li>\n<li>In the\u00a0<strong>Calling Data<\/strong> section, name variables and import each .csv mentioned in the README.\u00a0 Use the\u00a0<strong>read.csv()<\/strong> function, which takes the path to the file (in this case, the URL of the file) and header = TRUE (so that we have column titles).<\/li>\n<\/ol>\n<\/li>\n<li><strong>Exploratory Statistics<\/strong>\n<ol>\n<li>In the\u00a0<strong>Exploratory Statistics<\/strong> section, create a markdown cell and write up two questions that you think this data can answer.\u00a0 These should be about the relationship between different columns in the data (e.g., What is the relationship between date of birth and country of origin? &#8211; but don&#8217;t use that one, since we&#8217;ve already covered it in practice.)\u00a0 The first one should ask a question about the relationship between columns from different tables. The second one should ask about how <em>three or more<\/em> columns relate to each other.<\/li>\n<li>Create another cell in the\u00a0<strong>Exploratory Statistics<\/strong> section and use\u00a0<strong>summary()<\/strong> and\u00a0<strong>class()\u00a0<\/strong>to gather information about each of the columns you mentioned in your questions. 
Remember that you will want to identify a variable and a column in that variable using the syntax variable$column.<\/li>\n<li>Create a final markdown cell in the\u00a0<strong>Exploratory Statistics<\/strong> section and describe any insights.<\/li>\n<\/ol>\n<\/li>\n<li><strong>Correlation<\/strong>\n<ol>\n<li>Look at your questions and the tables they reference.\u00a0 Make a plan for merging those tables.\u00a0 Remember that the\u00a0<strong>merge()<\/strong> function takes in the two data frames that are being merged, as well as the column in each that contains the information that bridges both.\u00a0 Create a markdown cell in the\u00a0<strong>Correlation<\/strong> section and write up your merging plan.<\/li>\n<li>In a cell below the one you just created, merge the tables that you identified.\u00a0 Make sure to create new variables to hold those merged tables. Create a new cell to view your tables to make sure they are what you want them to be.<\/li>\n<li>Your next steps here will be determined by whether the questions you posed in step 3.1 require the comparison of columns with numerical data.\u00a0 If so, use those columns; if not, pick two other numerical columns from your merged table.<\/li>\n<li>First use\u00a0<strong>which()<\/strong> to create a new data frame in a new variable that DOES NOT have any NAs in the columns you wish to compare.<\/li>\n<li>Then,\u00a0use as many cells as you need and the\u00a0<strong>cor()<\/strong> function to explore correlations between the columns you identified.<\/li>\n<li>Add a markdown cell after each code cell to explain what the resulting correlation tells you about your questions.<\/li>\n<\/ol>\n<\/li>\n<li><strong>Categorical Variables<\/strong>\n<ol>\n<li>Create a new code cell in the <strong>Categorical Variables<\/strong> section.\u00a0If your questions in step 3.1 required you to explore relationships between categorical columns, use the\u00a0<strong>table()<\/strong> function to first create variables that 
contain count tables. If your questions in step 3.1 do not require you to explore relationships between categorical columns, pick two categorical columns from your merged table.<\/li>\n<li>For each table, produce a pretty table using\u00a0<strong>ftable()\u00a0<\/strong>and a proportion table using\u00a0<strong>prop.table()<\/strong>.<\/li>\n<li>For each table, use <strong>chisq.test()<\/strong>\u00a0to determine whether the distribution of data in that table is likely due to random chance or not.<\/li>\n<li>Add a markdown cell after each table to explain what the resulting proportions and chi-squared test tell you about your questions.<\/li>\n<\/ol>\n<\/li>\n<li><strong>Regression<\/strong>\n<ol>\n<li>Return to the question about the relationship between multiple columns that you posed in step 3.1.\u00a0 In a new markdown cell in the\u00a0<strong>Regression<\/strong> section, make a list of the columns you are interested in, along with whether they are categorical or numerical. Pick one of the numerical variables and identify it as your dependent variable.<\/li>\n<li>Subset your merged data so that each of the columns ONLY contains data that is NOT NA.<\/li>\n<li>Create a set of column titles and subset your data so that you have a new variable containing a new data frame that only has the columns you want.<\/li>\n<li>Create a markdown cell and speculate about how your independent variables might contribute to your dependent variable.<\/li>\n<li>Use\u00a0<strong>as.data.frame(dummy.code())<\/strong> to create a new variable and a new data frame that contains dummy codes for your categorical variable.\u00a0 If you have more than one categorical variable, you will need to create more than one dummy code data frame.<\/li>\n<li>Look at the table(s) you just created.\u00a0 Use the syntax data$new_column &lt;- dummy_data$one_column to add all but one of the dummy code columns to your data frame.<\/li>\n<li>Create a linear regression model using the\u00a0<strong>lm()<\/strong> 
function.\u00a0 Refer to the regression example for syntax.<\/li>\n<li>Use\u00a0<strong>summary()<\/strong> to assess which variables contribute to your model.<\/li>\n<li>Create the model again, dropping the variables that were not significant (those without stars in the summary output).<\/li>\n<li>Create a markdown cell describing what this model tells you about the relationship between different columns.<\/li>\n<\/ol>\n<\/li>\n<li><strong>(Finally) Introduction and title<\/strong>\n<ol>\n<li>Return to the top of your notebook and write up a paragraph summarizing your findings.\u00a0 In this paragraph, reference AT LEAST three of the readings we have completed this semester.<\/li>\n<li>Add a catchy title.<\/li>\n<li>Download your notebook (File &gt; Download as &gt; Notebook (.ipynb)) and upload it.<\/li>\n<\/ol>\n<\/li>\n<\/ol>\n","protected":false},"excerpt":{"rendered":"<p>This exercise brings together what we have learned about quantification so far.\u00a0 You will be creating your own notebook from scratch, calling data, manipulating that data, and analyzing it.\u00a0 Please submit the notebook via this Lyceum link (section A) (section B) by the start of class on October 25th. 
Note: Since you all are cleaning&hellip;<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[10],"tags":[],"class_list":["post-431","post","type-post","status-publish","format-standard","hentry","category-assignments"],"_links":{"self":[{"href":"https:\/\/courses.shroutdocs.org\/dcs104-fall2018\/wp-json\/wp\/v2\/posts\/431","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/courses.shroutdocs.org\/dcs104-fall2018\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/courses.shroutdocs.org\/dcs104-fall2018\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/courses.shroutdocs.org\/dcs104-fall2018\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/courses.shroutdocs.org\/dcs104-fall2018\/wp-json\/wp\/v2\/comments?post=431"}],"version-history":[{"count":5,"href":"https:\/\/courses.shroutdocs.org\/dcs104-fall2018\/wp-json\/wp\/v2\/posts\/431\/revisions"}],"predecessor-version":[{"id":779,"href":"https:\/\/courses.shroutdocs.org\/dcs104-fall2018\/wp-json\/wp\/v2\/posts\/431\/revisions\/779"}],"wp:attachment":[{"href":"https:\/\/courses.shroutdocs.org\/dcs104-fall2018\/wp-json\/wp\/v2\/media?parent=431"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/courses.shroutdocs.org\/dcs104-fall2018\/wp-json\/wp\/v2\/categories?post=431"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/courses.shroutdocs.org\/dcs104-fall2018\/wp-json\/wp\/v2\/tags?post=431"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}