Finding data in the long-tail

Biosystems Analytics

Scientists are increasingly examining the most comprehensive catalogue of datasets available for any particular question. Making sure you can find as much of the data relevant to a particular problem as possible thus begins to loom as a large issue. Although institutional repositories (such as NCBI, Dryad, Figshare, etc.) are great at storing the final published versions of datasets, some early and smaller-scale research data can get lost in the “long tail”. Anne Thessen has a great post over on her blog, the Data Detektiv, on how to locate and keep track of such “dark data”:

Finding relevant data, especially if the needed data are dark, can be a difficult and lengthy task. … Was there a way to discover data based on events earlier in the research workflow? After some thought, I realized that databases and lists of awards made by funding agencies were an…


“Hey Peter, I’m gonna need you to come in all the time. Does all the time work for you?”

Fans of the 1999 movie Office Space will surely have noticed that the recent front-page, above-the-fold article in last Sunday’s New York Times on working conditions at Amazon has stirred up a lot of interest, including a response from Jeff Bezos himself (who states, somewhat incongruously, that he has “zero tolerance for lack of empathy”). One of the reasons this article gained so much traction (as opposed to the steady drip-drip of articles about working conditions at its warehouses) is that it focused on the changing work environment of many upper-middle-class white-collar workers all over the country (who are, let’s face it, the readership of the New York Times). The rise of the 24/7 work culture, constant e-mail, tethering to our smartphones, and now the micro-surveillance of work “productivity” has crossed some kind of Rubicon, where more and more people are asking, like the Peter Gibbons character in the film: what’s the whole point of it?
