Code to check if packages are installed, and install them if not


Hello all,

We can certainly load each package at the start of each notebook, but if you are comfortable, you can also use the following code, which installs any package that is missing and then loads it:

# List your packages here (this example checks whether tidyRSS and stringr are installed)
packages <- c("tidyRSS", "stringr")

# Run this without making changes
package.check <- lapply(packages, FUN = function(x) {
  if (!require(x, character.only = TRUE)) {
    install.packages(x, dependencies = TRUE)
    library(x, character.only = TRUE)
  }
})
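
If you would rather see which packages are missing before anything gets installed, a minimal sketch along these lines may help. It assumes the same packages vector as above; is_installed and missing_packages are names introduced here for illustration only.

```r
# Same package list as above
packages <- c("tidyRSS", "stringr")

# requireNamespace() returns TRUE if a package can be loaded and FALSE
# otherwise; it does not attach the package the way library() does
is_installed <- vapply(packages, requireNamespace, logical(1), quietly = TRUE)

# Names of the packages that would still need install.packages()
missing_packages <- packages[!is_installed]
print(missing_packages)
```

An empty result means everything on the list is already installed.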

Class 11.1

Scaffolding # 5 – Final Project Idea


This is your opportunity to select, and set the parameters for, your final project.  Work through this in the following order:

  1. Review the comments you received from me on your Scaffolding # 4 assignment
  2. Look over the data that we have been working with (consult the README and also look at the data tables themselves)
  3. Think about the kind of work that will be required to complete the final project (full project description here)
  4. In light of the restrictions of the data, your time, and your interests, pick one of the ideas you submitted as part of Scaffolding # 4
  5. Revise your question, and your discussion of the data cleaning and additional research you will have to undertake.
  6. Submit your revised idea (section A) (section B)

Practice Exercise # 6 – Data Visualization


This exercise brings together what we have learned about data visualization.  You will be creating your own notebook from scratch, calling data, manipulating that data, and visualizing it.  Please submit the notebook you create via this Lyceum link (section A) (section B) by the start of class on November 27th.

Steps:

Review Best Practices. As a class, we came up with the following best practices for data visualization.  Read over them, and make sure that you have a good sense of what they would mean for your own visualizations:

  • Acknowledge the source of your data
  • Understand the story behind your data
  • Understand what you’re looking for
  • Make sure the visualization is clear
  • Consider your audience
  • Use good labels, axis titles, and colors
  • Use trend lines appropriately
  • Make sure that your data is sufficient and representative
  • Make sure your visualization is accessible
  • Make sure your visualization is interactive (where appropriate)
  • Make sure that you use frameworks that your audience is familiar with

Set up

  1. Create a new notebook.  Name it PE6 (do not include your name).  You will be downloading, zipping, and uploading this folder.
  2. Title your notebook. Make your first cell into a markdown cell (hit escape and then m) and then enter one hashtag – this makes the cell into a title cell.  Write a placeholder title (you’ll come back at the end and change the title). Run the cell.
  3. Title your sections. Repeat the steps above for five new cells but use two hashtags (this creates smaller title text).  Name them: Analysis, Packages, Calling Data, First Viz, Second Viz

Packages

  1. Load the package “ggplot2” by running library("ggplot2").  YOU DO NOT NEED TO RUN install.packages().

Calling Data

  1. Call in each of the tables we have been working with: person, work, family, factory.  Save each in a different variable.
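
The file locations for the class tables are not given in this post, so the sketch below writes a small made-up file to a temporary folder and reads it back. The path, column names, and values are all hypothetical; in your notebook, point read.csv() at the real location of each table and repeat for work, family, and factory.

```r
# Made-up stand-in for the person table (hypothetical columns and values)
person_path <- file.path(tempdir(), "person.csv")
writeLines(c("person_id,last_name,age",
             "1,Smith,34",
             "2,Jones,NA",
             "3,Brown,27"),
           person_path)

# read.csv() returns a data frame; save each table in its own variable
person <- read.csv(person_path, stringsAsFactors = FALSE)

# Quick checks that the table loaded the way you expect
str(person)
head(person)
```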

First Viz

  1. Take some time to re-familiarize yourself with this data.  Look at the README if you need to.
  2. Take some time to familiarize yourself with the kinds of visualizations we created in the Challenges of Data Viz notebook
  3. Come up with ONE question about the relationship between two numerical columns and ONE categorical column from TWO different tables.
  4. Merge those tables.
  5. Remove the NAs from the columns you are interested in.
  6. Consider the best practices for dataviz outlined above, and make a visualization, using one of the methods we used in the Challenges of Data Viz notebook.  You will be plotting the two numerical columns, and color-coding according to the categorical column.
  7. Title your first viz section (something other than first viz)
  8. In your analysis section, write a paragraph outlining what the visualization is meant to show, and how you designed it in keeping with best practices.
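
The merge, NA-removal, and plotting steps above can be sketched as follows. This is only an illustration on made-up data: the tables, the column names (person_id, age, wage, origin), and the values are assumptions, and the scatterplot is just one of the plot types you might choose. The check around the plotting code simply skips the plot if ggplot2 is not available.

```r
# Made-up stand-ins for two class tables (all names and values are hypothetical)
person <- data.frame(person_id = 1:6,
                     age       = c(34, 41, NA, 27, 52, 38),
                     origin    = c("USA", "Canada", "Greece", "USA", NA, "Greece"))
work <- data.frame(person_id = c(1, 2, 3, 4, 5, 6),
                   wage      = c(12.5, 15.0, 11.0, NA, 18.0, 14.0))

# Merge the two tables on the column they share
merged <- merge(person, work, by = "person_id")

# Keep only rows with no NA in the three columns of interest
cols  <- c("age", "wage", "origin")
clean <- merged[complete.cases(merged[, cols]), ]

# Plot the two numerical columns, colored by the categorical column
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(clean, aes(x = age, y = wage, color = origin)) +
    geom_point() +
    labs(title = "Wage by age (made-up data)",
         x = "Age (years)", y = "Wage", color = "Country of origin")
  print(p)
}
```

Compare nrow(merged) with nrow(clean) to see how many rows the NA removal dropped; that number is worth mentioning in your analysis.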

Second Viz

  1. Look at the additional types of visualizations available in ggplot2 using this ggplot2 cheat sheet
  2. Pick a visualization that we have not used already.
  3. Using different variables, repeat the steps above.
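
Which plot types count as "not used already" depends on the class notebook, so the boxplot below is chosen purely as an illustration. The data frame, its column names (wage, factory), and the values are made up.

```r
# Made-up data: one numerical and one categorical column (hypothetical names)
workers <- data.frame(
  wage    = c(12.5, 15.0, 11.0, 13.5, 18.0, 14.0, 16.5, 10.5),
  factory = c("Mill A", "Mill A", "Mill B", "Mill B",
              "Mill A", "Mill B", "Mill A", "Mill B")
)

# A boxplot shows the spread of a numerical column within each category
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(workers, aes(x = factory, y = wage)) +
    geom_boxplot() +
    labs(title = "Wage distribution by factory (made-up data)",
         x = "Factory", y = "Wage")
  print(p)
}
```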

Analysis

  1. Write an introduction that explains any insights that you gained from your data, and any new questions that your visualizations raised.
  2. Illustrate your insights using your images.
  3. Create a witty and/or informative title for your notebook.

Class 10.2


Challenges of Data Visualization


Class 10.1

Final Project


Your final project will be a public-facing website.  You should build it on a subdomain different from the one you use for your class blog.  You should submit your website to me via e-mail by the end of finals week.  Your website should include the following components:

  • The questions you sought to answer about your data.
  • Discussion of how the data was produced (both the collaborative work done by the class, and the cleaning and additional research that you undertook) (~250 words + at least 3 citations)
  • Any context that you think your viewer will need to understand the data (~250 words + at least 1 citation)
  • Ethical concerns you grappled with and addressed while working with the data (~250 words + at least 3 citations)
  • Analysis – this should include the code that you wrote in the form of an embedded notebook, and a discussion of why you chose the methods you used (Code + ~250 words + at least 3 citations)

Practice Exercise # 5 – Networks


This exercise brings together what we have learned about networks so far, and builds on some of the skills from Practice Exercise # 3.  You will be creating your own notebook from scratch, calling data, manipulating that data, and analyzing it.  Please submit the zipped folder containing the notebook and the 7 .png images you create via this Lyceum link (section A) (section B) by the start of class on November 13th

Steps:

Set up

  1. Create a new folder.  Name it PE5 (do not include your name).  You will be downloading, zipping, and uploading this folder.
  2. Create a new notebook in this new folder. Name it DSC_104_PE5 (do not include your name – I will grade anonymously and then return your notebooks)
  3. Title your notebook. Make your first cell into a markdown cell (hit escape and then m) and then enter one hashtag – this makes the cell into a title cell.  Write a placeholder title (you’ll come back at the end and change the title). Run the cell.
  4. Title your sections. Repeat the steps above for seven new cells but use two hashtags (this creates smaller title text).  Name them: Analysis, Packages, Calling Data, Constructing Networks, Analyzing Networks, Visualizing Networks, Text networks

Installing Packages

  1. Install the package “igraph”
  2. Load the package “igraph”

Calling Data

  1. Download the .csv file people_family.csv, and enter a 1 in each cell where the last name of the person in the row matches the last name in the column.  (This assumes that families are related by both marriage and birth)
  2. Save that file and upload it to your notebook environment
  3. Load people_family.csv as a matrix
  4. Load people_work.csv as a matrix (use the link from the Network Analysis notebook)
  5. Load the attributes data file
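
One way to load a .csv file as a matrix is sketched below. The stand-in file is made up (three hypothetical people, two last names) and assumes the layout described in step 1: people in the rows, last names in the columns, and a 1 where they match. Replace the path with the location of your own uploaded file.

```r
# Made-up stand-in for people_family.csv (hypothetical names)
family_path <- file.path(tempdir(), "people_family.csv")
writeLines(c(",Smith,Jones",
             "Ann Smith,1,0",
             "Bob Smith,1,0",
             "Cal Jones,0,1"),
           family_path)

# row.names = 1 uses the first column as row labels instead of data;
# as.matrix() then converts the data frame into a numeric matrix
people_family <- as.matrix(read.csv(family_path, row.names = 1))

print(people_family)
dim(people_family)
```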

Constructing Networks

  1. Convert each matrix into a matrix that shows connections between people (use the code from the Network Analysis notebook)
  2. Create a new variable, and add the matrices together (just use a +)
  3. Set the diagonals of this new matrix to NA
  4. Create a network graph object based on this new matrix
  5. Set the diagonals of your people_work matrix to NA
  6. Create a network graph object based on your people_work matrix
  7. Set the diagonals of your people_family matrix to NA
  8. Create a network graph object based on your people_family matrix
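
The matrix steps above can be sketched in base R as follows. The Network Analysis notebook code is not reproduced here, so this assumes the usual approach of multiplying each person-by-group matrix by its transpose; the people, families, and workplaces are made up, and family_net, work_net, and combined_net are names introduced for illustration. Creating the graph objects themselves should follow the notebook code.

```r
# Made-up person-by-group matrices (hypothetical people, families, workplaces)
people <- c("Ann Smith", "Bob Smith", "Cal Jones")
people_family <- matrix(c(1, 0,
                          1, 0,
                          0, 1),
                        nrow = 3, byrow = TRUE,
                        dimnames = list(people, c("Smith", "Jones")))
people_work <- matrix(c(1, 0,
                        0, 1,
                        0, 1),
                      nrow = 3, byrow = TRUE,
                      dimnames = list(people, c("Mill A", "Mill B")))

# Multiplying a matrix by its transpose counts, for every pair of people,
# how many groups (families or workplaces) they share
family_net <- people_family %*% t(people_family)
work_net   <- people_work %*% t(people_work)

# Adding the two person-by-person matrices combines both kinds of connection
combined_net <- family_net + work_net

# The diagonal only records each person's tie to themselves, so blank it out
diag(combined_net) <- NA

print(combined_net)
```

In this toy example Ann and Bob are tied through family, Bob and Cal through work, and Ann and Cal not at all.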

Analyzing Networks

  1. Calculate the betweenness for each network. Make sure that you assign names to your statistics using V().
  2. Make note of differences among your three networks
  3. Calculate the centrality for each network.  Make sure that you assign names to your statistics using V().
  4. Make note of differences among your three networks.
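
As a rough illustration of these statistics, the sketch below builds a tiny made-up network and computes betweenness and degree. Degree is only one kind of centrality and is an assumption here; use whichever measure the Network Analysis notebook uses. The matrix has a zero diagonal so that the example does not depend on how your igraph version treats NA entries, and the code is skipped if igraph is not installed.

```r
# Made-up person-by-person matrix: Ann-Bob and Bob-Cal are connected
people <- c("Ann", "Bob", "Cal")
adj <- matrix(c(0, 1, 0,
                1, 0, 1,
                0, 1, 0),
              nrow = 3, byrow = TRUE, dimnames = list(people, people))

if (requireNamespace("igraph", quietly = TRUE)) {
  library(igraph)
  g <- graph_from_adjacency_matrix(adj, mode = "undirected", diag = FALSE)

  # Betweenness: how often a person sits on the shortest path between others
  btw <- betweenness(g)
  names(btw) <- V(g)$name

  # Degree centrality: how many direct connections each person has
  deg <- degree(g)
  names(deg) <- V(g)$name

  print(btw)
  print(deg)
}
```

Bob is the only route between Ann and Cal, so he has the highest betweenness as well as the highest degree.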

Visualizing Networks

  1. Visualize each network (use the code from the Network Analysis notebook).  Save each into a different .png file
  2. Look at each network, make note of differences
  3. Add color according to country of origin. Instead of using names for colors, go to http://colorbrewer2.org/, set the number of data classes to 3 (one for each country – USA, Canada and Greece) and set the nature of your data to qualitative.  Pick one of the color schemes that you like, and enter the hex code (it will look something like #7fc97f) in place of “red”, “blue” and “green” in the Network Analysis notebook.
  4. We are going to add one more dimension, which is strength of connection.  Create a new variable that will capture the strength of the relationship between two nodes (in network parlance, this is known as edge.weight).  Use the code below, but replace GRAPH-REPLACE-THIS-TEXT with the name of the graph object that holds the network that shows both family and work connections:
    YOURNEWVARIABLE <- get.edge.attribute(GRAPH-REPLACE-THIS-TEXT, "weight")
  5. Now, run the code that makes your network visualization again, but replace
    edge.width = 0.25

    with

    edge.width = YOURNEWVARIABLE
  6. Make note of any changes.
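
The coloring and edge-width steps can be sketched on a tiny made-up network as shown below. The people, countries, and weights are hypothetical; the three hex codes are the 3-class qualitative "Accent" scheme from ColorBrewer, and the plotting options from the Network Analysis notebook are not reproduced. E(g)$weight is used here as another way to read the weight attribute. The plotting code is skipped if igraph or PNG support is not available.

```r
# Made-up network with weighted ties (hypothetical people and countries)
people  <- c("Ann", "Bob", "Cal")
country <- c(Ann = "USA", Bob = "Canada", Cal = "Greece")
adj <- matrix(c(0, 2, 0,
                2, 0, 1,
                0, 1, 0),
              nrow = 3, byrow = TRUE, dimnames = list(people, people))

# One hex code per country
palette_by_country <- c(USA = "#7fc97f", Canada = "#beaed4", Greece = "#fdc086")

# Look up each person's color through their country
node_colors <- unname(palette_by_country[country[people]])

if (requireNamespace("igraph", quietly = TRUE) && capabilities("png")) {
  library(igraph)
  g <- graph_from_adjacency_matrix(adj, mode = "undirected",
                                   weighted = TRUE, diag = FALSE)

  # Edge weights: the strength of each connection
  edge_weights <- E(g)$weight

  # Save the plot to a .png file, with thicker lines for stronger ties
  png(file.path(tempdir(), "combined_network.png"))
  plot(g, vertex.color = node_colors, edge.width = edge_weights)
  dev.off()
}
```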

Cleaning your notebook

  1. Before you submit your notebook, go back and delete cells with extraneous information or attempts that did not work.
  2. If you feel like you need to explain decisions or flag problem points, do so now.

Analysis 

  1. Write an introduction that explains any insights that you gained from your data, and any new questions that your visualizations raised.  Illustrate your insights using your images.  You can insert images into your notebook using the following syntax (replace image.png with the file name of the image you want to insert):
    ![the title for your image](image.png)
  2. Create a witty and/or informative title for your notebook.

Class 9.2