The incredible growth and presence of social technologies in all aspects of life translate into large data sets that help researchers understand human behavior, social system design, and the development of digital culture. However, as John Markoff points out in a recent NYT article, most of these data are “forbidden to researchers.”
Among the reasons for this lockdown are cost, privacy, and industrial secrecy. Indeed, it is difficult to put together and maintain data sets from social computing services in a way that complies with those services’ privacy policies, protects competitiveness, and does not drain strained resources.
Despite these challenges, several organizations have made efforts to share data with researchers. For example, Reddit, StackExchange, Yelp, and Wikipedia have put in the time and effort to release data sets for the research community.
During the Microsoft Research Faculty Summit last week, FUSE Labs announced to the participants of the Social Media Workshop that it will be releasing log or instrumentation data from Socl, a website that lets people share their interests using search. Despite having been unveiled only a few months ago, Socl already has several hundred thousand users who have contributed a large number of aesthetically pleasing posts. We hope that access to these data will help researchers investigate the birth of an online community, and that it will also help the research community engage in a conversation about open data from social media systems.
If you have ideas on how to use Socl data for your research, please get in touch at firstname.lastname@example.org.