Well, last week when I stated that I had my measures calculated for the VoxForge network, I may have jumped the gun a wee bit. I was still running the code that calculates the measures, and later that night, while it was computing the geodesic distance, my computer ran out of memory ... again. This is the computer with 16 GB of RAM, so I am kind of at a loss as to what needs to happen. I have every other measure, but no matter what I try, I don't have enough memory to compute this one.
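For context on why this measure is so memory-hungry: an all-pairs distance table for 14591 words has over 200 million entries. One way around that, sketched below, is to run a breadth-first search from one word at a time and only keep running totals. This is a minimal sketch under assumptions, not the code used for these networks: the adjacency-dict format and the name `average_geodesic_distance` are hypothetical, and it averages only over ordered pairs that are actually reachable.

```python
from collections import deque


def average_geodesic_distance(adj):
    """Average shortest-path length over reachable ordered pairs.

    adj maps each node to an iterable of its successors (hypothetical
    format). Only one distance dict is alive at a time, so memory grows
    with the number of nodes rather than with the number of node pairs.
    """
    total = 0
    pairs = 0
    for source in adj:
        # Plain breadth-first search from this source.
        dist = {source: 0}
        queue = deque([source])
        while queue:
            node = queue.popleft()
            for nbr in adj.get(node, ()):
                if nbr not in dist:
                    dist[nbr] = dist[node] + 1
                    queue.append(nbr)
        # Fold this source's distances into the running totals, then
        # let the dict be discarded before the next source.
        total += sum(dist.values())
        pairs += len(dist) - 1  # exclude the source itself
    return total / pairs if pairs else float("nan")


# Toy example: an undirected path a - b - c, stored in both directions.
toy = {"a": ["b"], "b": ["a", "c"], "c": ["b"]}
toy_average = average_geodesic_distance(toy)
```

The trade-off is time: this still does one full search per word, so it is slow on a large network, but it should not exhaust RAM the way a full distance matrix can.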
What I decided to do was reduce the network by removing all words that occur in the data set fewer than two times (that is, words that appear only once). This dropped the VoxForge network from 14591 unique words down to 9195, so I was able to recalculate all of my statistics (including the geodesic distance) just fine. But now I can't compare these measures to my other networks, because those networks are not reduced. So, I am currently in the process of reducing all of the CHILDES networks and recalculating all of my measures.
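The reduction itself can be sketched in a few lines. This is only an illustration under assumptions: it supposes the data set is available as a flat list of word tokens and the network as a list of word pairs, and the name `reduce_by_frequency` is hypothetical rather than taken from the actual analysis code.

```python
from collections import Counter


def reduce_by_frequency(tokens, edges, min_count=2):
    """Drop words seen fewer than min_count times, plus their edges.

    tokens: every word occurrence in the data set (hypothetical format).
    edges:  (word, word) pairs making up the network (hypothetical format).
    """
    counts = Counter(tokens)
    kept = {word for word, count in counts.items() if count >= min_count}
    # An edge survives only if both of its endpoints survive.
    kept_edges = [(u, v) for u, v in edges if u in kept and v in kept]
    return kept, kept_edges


# Toy example: "saw" and "cat" occur once each, so they are removed.
tokens = ["the", "dog", "saw", "the", "cat", "dog"]
edges = [("the", "dog"), ("dog", "saw"), ("saw", "the"), ("the", "cat")]
kept, kept_edges = reduce_by_frequency(tokens, edges)
```

Applying the same function with the same `min_count` to every network is what keeps the reduced measures comparable across VoxForge and CHILDES.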
I figured this wasn't a big deal because I know that I won't run into any memory issues with smaller networks, and it was something that Professor Brumberg and I had discussed earlier in the semester as a way to see how the networks change. I may do a similar reduction where I remove nodes with degree less than two and see what happens there. I've also been wondering if I should report which nodes I remove during the reduction, along with information about them (their in-degree, neighborhood, etc.), to see if we can find anything interesting. I have not written any code for that yet, though, and will ask Professor Beckage if it is worth my time.
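A degree-based reduction that also records what it removes might look roughly like the sketch below. It assumes a directed network given as a list of (source, target) pairs, treats "degree" as in-degree plus out-degree, and makes a single pass without re-checking nodes whose degree drops after their neighbors are removed. The function name and the report fields are hypothetical.

```python
from collections import defaultdict


def reduce_by_degree(edges, min_degree=2):
    """Remove nodes whose total degree is below min_degree, in one pass.

    Returns the surviving edges and a report describing each removed
    node: its in-degree, out-degree, and neighborhood.
    """
    preds = defaultdict(set)  # node -> nodes pointing at it
    succs = defaultdict(set)  # node -> nodes it points at
    for u, v in edges:
        succs[u].add(v)
        preds[v].add(u)
    nodes = set(preds) | set(succs)

    removed = {}
    for node in nodes:
        in_deg = len(preds[node])
        out_deg = len(succs[node])
        if in_deg + out_deg < min_degree:
            removed[node] = {
                "in_degree": in_deg,
                "out_degree": out_deg,
                "neighborhood": sorted(preds[node] | succs[node]),
            }

    kept_edges = [(u, v) for u, v in edges
                  if u not in removed and v not in removed]
    return kept_edges, removed


# Toy example: "d" has a single incoming edge, so it is removed.
toy_edges = [("a", "b"), ("b", "c"), ("c", "a"), ("c", "d")]
toy_kept, toy_removed = reduce_by_degree(toy_edges)
```

Keeping the report as a plain dictionary makes it easy to dump to a file later if the removed nodes turn out to be worth a closer look.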
After I finish the analysis on the reduced CHILDES networks, I will code the Monte Carlo simulation, which should not take very much time. In fact, writing the code for the reduction has given me a really good idea of what needs to be done for the simulation, and it should be very simple to write.