
Research in Network Analysis

Rebekah Manweiler, Erick Oduniyi, Prof. Jon Brumberg, Prof. Nicole Beckage
Fall 2017-Spring 2018
University of Kansas, Lawrence, Kansas

This week I finished my presentation draft and went to the Undergraduate Research Symposium presentation workshop to get feedback. It was very helpful, but unfortunately we did not get a chance to share what we had already worked on. We mainly focused on the logistics of the symposium, how to start working on a presentation, and what goes into a good one. So this weekend I will be giving my presentation to some of my friends, and next week to Professor Brumberg, for feedback and corrections.

I also worked on getting more information about the nodes we removed to create the reduced networks. I was able to get the name of each removed node and its in-degree and out-degree. I did not collect the nodes it was connected to at first, because I wanted to make sure that the maximum in-degree and out-degree was always 5 so I could preallocate the right number of entries in the dataframe holding the data. I did find that the max was 5, so I can now start running the code to get the names of every removed node's neighbors. I am worried that this code will take a while to run, because even just getting the name, in-degree, and out-degree took quite a while. I think I was able to go through about 2,000 nodes in an hour, and the VoxForge network has over 14,000 nodes, so I can only imagine how long this will take.
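For anyone curious what this lookup might look like, here is a minimal Python sketch. It is not the code I actually ran. It assumes the network is available as a list of directed (source, target) word pairs, and the function names and row layout are made up for illustration. The idea is to index every node's neighbors once in a single pass over the edges, so that describing each removed node is a dictionary lookup instead of a scan of the whole network.

```python
from collections import defaultdict


def build_neighbor_index(edges):
    """Index predecessors and successors of every node in one pass over the edges."""
    predecessors = defaultdict(set)
    successors = defaultdict(set)
    for source, target in edges:
        successors[source].add(target)
        predecessors[target].add(source)
    return predecessors, successors


def describe_removed_nodes(edges, removed_nodes):
    """Return one row per removed node: name, in/out degree, and neighbor names.

    Degrees here count distinct neighbors, which is an assumption of this sketch.
    """
    predecessors, successors = build_neighbor_index(edges)
    rows = []
    for node in removed_nodes:
        incoming = sorted(predecessors.get(node, ()))
        outgoing = sorted(successors.get(node, ()))
        rows.append({
            "node": node,
            "in_degree": len(incoming),
            "out_degree": len(outgoing),
            "in_neighbors": incoming,
            "out_neighbors": outgoing,
        })
    return rows


# Toy example with made-up words, not real VoxForge data.
toy_edges = [("the", "dog"), ("a", "dog"), ("dog", "runs")]
toy_rows = describe_removed_nodes(toy_edges, ["dog"])
```

Because the neighbor lists are stored as plain Python lists, there is no need to know the maximum degree ahead of time. Preallocating five columns would still make sense if the rows have to end up in a fixed-width dataframe.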

Updated: Jan 1, 2019

This week I finished creating the reduced networks and calculating their respective network measures. Below are the tables of results.


Professor Brumberg and I have also decided that I need to get more information about the nodes that were removed to see what kind of patterns arise. This should give us a better idea of why the measures change the way they do. So I will work on getting information about the removed nodes and begin working on the Monte Carlo simulations. I would like to have all of this done and in my presentation for the Undergraduate Research Symposium. (Oh yeah, by the way, I applied to give a talk about this research at the KU Undergraduate Research Symposium and was accepted! So I will be presenting this research on April 28th!)

I will also be working on a first draft of my presentation, which I will first present to Professor Brumberg and Erick on Monday. After that I will attend a URS presentation workshop to get more feedback.

Well, last week when I stated that I had my measures calculated for the VoxForge network, I may have jumped the gun a wee bit. I was in the process of running my code to calculate the measures, and later that night, when I was calculating the geodesic distance, my computer ran out of memory ... again. This is the computer with 16 GB of RAM, so I am kind of at a loss as to what needs to happen. I have every other measure but this one, and no matter what I try, I don't have enough memory to compute the statistic.
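One idea I have not tried yet: if the memory problem comes from holding the full all-pairs distance matrix at once (I am only guessing that this is the cause), the average geodesic distance could instead be accumulated one source node at a time with a breadth-first search. The Python sketch below shows that idea on an unweighted directed network stored as a hypothetical adjacency dictionary. It is an illustration, not the code from my analysis.

```python
from collections import deque


def average_geodesic_distance(adjacency):
    """Mean shortest-path length over all ordered pairs that are reachable.

    Only one source's distances are kept in memory at a time, so memory use
    grows with the number of nodes rather than the number of node pairs.
    """
    total_distance = 0
    reachable_pairs = 0
    for source in adjacency:
        distances = {source: 0}
        queue = deque([source])
        while queue:
            node = queue.popleft()
            for neighbor in adjacency.get(node, ()):
                if neighbor not in distances:
                    distances[neighbor] = distances[node] + 1
                    queue.append(neighbor)
        total_distance += sum(distances.values())
        reachable_pairs += len(distances) - 1  # exclude the source itself
    if reachable_pairs == 0:
        return float("nan")
    return total_distance / reachable_pairs


# Toy chain a -> b -> c with made-up node names.
toy_adjacency = {"a": ["b"], "b": ["c"], "c": []}
toy_average = average_geodesic_distance(toy_adjacency)
```

This trades memory for time: it still runs one search per node, so it would not be fast on 14,000 nodes, but it should never need the whole distance matrix at once. Unreachable pairs are simply skipped here, which may or may not match how a given network package defines the measure.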

What I decided to do was reduce the network by removing all words that occur in the data set fewer than two times. This dropped the VoxForge network size from 14,591 unique words down to 9,195, so I was able to recalculate all of my statistics (including the geodesic distance) just fine. But now I can't compare these measures to my other networks, because those are not reduced. So I am currently in the process of reducing all of the CHILDES networks and recalculating all of my measures.
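As a rough illustration of this reduction, here is a small Python sketch. It assumes the data set is a list of utterances, each a list of words, and that the network is a list of directed word pairs. Both layouts and all the names are assumptions made for the example, not a description of my actual pipeline.

```python
from collections import Counter


def split_by_frequency(utterances, min_count=2):
    """Split the vocabulary into words kept and words removed by frequency."""
    counts = Counter(word for utterance in utterances for word in utterance)
    kept = {word for word, count in counts.items() if count >= min_count}
    removed = set(counts) - kept
    return kept, removed


def filter_edges(edges, kept):
    """Keep only the edges whose two endpoints both survive the reduction."""
    return [(source, target) for source, target in edges
            if source in kept and target in kept]


# Toy data with made-up utterances.
toy_utterances = [["the", "dog", "runs"], ["the", "cat", "runs"]]
toy_kept, toy_removed = split_by_frequency(toy_utterances)
toy_edges = filter_edges([("the", "dog"), ("the", "runs")], toy_kept)
```

Returning the removed words alongside the kept ones makes it easy to look at what was dropped later on.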

I figured this wasn't a big deal because I know that I won't run into any memory issues with the smaller networks, and it was something that Professor Brumberg and I had discussed earlier in the semester as a way to see how the networks change. I may do a similar reduction where I remove nodes with degree less than two and see what happens there. I've also been wondering if I should be reporting which nodes I remove during the reduction, along with information about them (their in-degree, neighborhood, etc.), to see if we can find anything interesting. I have not written any code for that yet, though, and will ask Professor Beckage if it is worth my time.
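The degree-based version of the reduction could look something like the Python sketch below. I have not decided whether "degree" should mean in-degree, out-degree, or both. This sketch assumes total degree (in plus out) and a single pass, and the function name is hypothetical.

```python
from collections import Counter


def reduce_by_degree(edges, min_degree=2):
    """Drop nodes whose total degree (in plus out) is below min_degree.

    This is a single pass: nodes that fall below the threshold only because
    a neighbor was removed are NOT removed again.
    """
    degree = Counter()
    for source, target in edges:
        degree[source] += 1  # counts toward out-degree
        degree[target] += 1  # counts toward in-degree
    kept = {node for node, d in degree.items() if d >= min_degree}
    removed = set(degree) - kept
    kept_edges = [(source, target) for source, target in edges
                  if source in kept and target in kept]
    return kept_edges, removed


# Toy network with made-up node names.
toy_edges = [("a", "b"), ("b", "c"), ("c", "a"), ("c", "d")]
toy_kept_edges, toy_removed = reduce_by_degree(toy_edges)
```

Repeating the pass until nothing else is removed would give a stricter reduction, so which variant to use is a choice that would need to be reported with the results.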

After I finish the analysis on the reduced CHILDES networks, I will code the Monte Carlo simulation, which should not take very much time. In fact, writing the code for the reduction has given me a really good idea of what needs to be done for the simulation, and it should be very simple to write.
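I have not written the simulation yet, so the Python sketch below is only one guess at its shape. It assumes the simulation removes the same number of nodes as the real reduction, but chosen at random, many times over, and recomputes a measure each time. Mean total degree stands in for whichever measure is of interest, and all names and parameters are hypothetical.

```python
import random


def mean_degree(nodes, edges):
    """Mean total degree (in plus out) of a directed network."""
    if not nodes:
        return 0.0
    return 2 * len(edges) / len(nodes)


def random_removal_trials(nodes, edges, n_remove, n_trials=1000, seed=0):
    """Remove n_remove random nodes per trial and record the measure each time."""
    rng = random.Random(seed)  # fixed seed so runs are repeatable
    node_list = sorted(nodes)
    results = []
    for _ in range(n_trials):
        removed = set(rng.sample(node_list, n_remove))
        kept_nodes = [node for node in node_list if node not in removed]
        kept_edges = [(source, target) for source, target in edges
                      if source not in removed and target not in removed]
        results.append(mean_degree(kept_nodes, kept_edges))
    return results


# Toy chain a -> b -> c -> d with made-up node names.
toy_nodes = {"a", "b", "c", "d"}
toy_edges = [("a", "b"), ("b", "c"), ("c", "d")]
toy_results = random_removal_trials(toy_nodes, toy_edges, n_remove=1, n_trials=50)
```

The measure from the real reduction could then be compared against the spread of these random trials to see whether removing low-frequency words changes the network differently than removing random words would.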
