history flow: results

What can we learn from history flow? Here are some of the patterns revealed by the visualization.
See the paper we published on our history flow study: "Studying Cooperation and Conflict between Authors with history flow Visualizations".
A simple example

Here's an example of a simple page with just a few edits: the first eight versions of the Wikipedia entry for IBM. The page has three named authors (listed at left), including a script that changed some formatting. Each author is given a unique color. Several anonymous authors also made contributions; their insertions are shown in shades of gray. The green regions show the contributions of the initial author, Peter Winnberg, many of which persist throughout the versions shown. Text that persists over time is darkened to indicate its age, so at the right side of the diagram Peter Winnberg's contributions have changed from bright green to dark green.

Figure: Visualizing every saved version of the page on "IBM", with versions spaced equally. Hues indicate authorship; brightness indicates age of text, with brighter colors being more recent.
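The color encoding described above (hue for authorship, brightness for age, gray for anonymous text) can be illustrated with a small sketch. This is not the history flow implementation: the function name `text_color`, the `min_brightness` floor, and the linear darkening are all assumptions made for illustration.

```python
import colorsys

def text_color(author_hue, age, max_age, min_brightness=0.35):
    """Return an (r, g, b) tuple, each component in 0..1, for a span of text.

    author_hue: hue in 0..1 for a named author, or None for an anonymous
                author (rendered as a shade of gray).
    age:        how many versions the text has survived so far.
    max_age:    the largest age shown in the diagram.
    Assumption: brightness falls off linearly with age down to
    min_brightness; the real tool may use a different curve.
    """
    if max_age <= 0:
        fraction = 0.0
    else:
        # Clamp to 0..1 so out-of-range ages still give a valid color.
        fraction = min(max(age / max_age, 0.0), 1.0)
    brightness = 1.0 - fraction * (1.0 - min_brightness)
    if author_hue is None:
        # Zero saturation yields gray at the same brightness.
        return colorsys.hsv_to_rgb(0.0, 0.0, brightness)
    return colorsys.hsv_to_rgb(author_hue, 1.0, brightness)
```

With this scheme, text from the same author keeps its hue across versions and only gets darker as it ages, which is the effect described for the initial author's contributions.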
Spacing by date

This history flow diagram shows the same eight versions of the IBM page, but this time versions are spaced horizontally according to revision date, so that changes that took place very rapidly show up right next to each other. This view shows that most of the activity happened in February and March of 2002.

Figure: Visualizing every saved version of the page on "IBM", with versions spaced by revision date. Hues indicate authorship; brightness indicates age of text, with brighter colors being more recent.
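The difference between the two layouts comes down to how each version's horizontal position is computed. The sketch below shows one way to do it, assuming a list of revision timestamps sorted oldest first; the function name `x_positions` and its parameters are hypothetical and not taken from the tool.

```python
from datetime import datetime

def x_positions(timestamps, width, by_date=False):
    """Map each version to an x coordinate in the range 0..width.

    timestamps: revision datetimes, sorted oldest first.
    by_date:    False spaces versions equally; True spaces them in
                proportion to elapsed time between revisions.
    """
    n = len(timestamps)
    if n == 0:
        return []
    if n == 1:
        return [0.0]
    start = timestamps[0]
    span = (timestamps[-1] - start).total_seconds()
    if not by_date or span <= 0:
        # Equal spacing (also the fallback when all timestamps coincide).
        return [width * i / (n - 1) for i in range(n)]
    return [width * (t - start).total_seconds() / span for t in timestamps]

# Made-up dates for illustration; these are not the IBM page's revisions.
example = [datetime(2002, 1, 1), datetime(2002, 1, 2), datetime(2002, 1, 11)]
equal = x_positions(example, 100)
dated = x_positions(example, 100, by_date=True)
```

In the date-spaced layout the second version sits close to the first because only one day separates them, which is how bursts of activity become visible.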
Vandalism

As publicly editable sites, wikis are vulnerable to vandalism. We've examined many pages on Wikipedia that treat controversial topics, and have discovered that most have, in fact, been vandalized at some point in their history. But we've also found that vandalism is usually repaired extremely quickly--so quickly that most users will never see its effects. The pictures below tell the story.

Figure: Visualizing every saved version of the page on "abortion", with each version getting equal space. The vertical black interruptions indicate times when a visitor has deleted most of the page.

Figure: The same page on "abortion", but here horizontal spacing corresponds to time, so that rapid-fire changes show up almost on top of each other. Because vandalism is repaired so quickly, it does not show up in this view of the visualization.
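The same pattern (a version where most of the page disappears, followed shortly by a version that restores it) can also be looked for numerically. The sketch below is a rough heuristic, not the method used in the study: it assumes each revision is reduced to a timestamp and a text length, and the thresholds `drop_ratio` and `recover_ratio` are arbitrary illustrative values.

```python
from datetime import datetime, timedelta

def find_mass_deletions(revisions, drop_ratio=0.2, recover_ratio=0.8):
    """Find revisions that delete most of a page and how long repair took.

    revisions: list of (timestamp, text_length) pairs, oldest first.
    Returns a list of (index, repair_delay) pairs, where repair_delay is a
    timedelta, or None if the page never recovered in the data given.
    """
    events = []
    for i in range(1, len(revisions)):
        prev_len = revisions[i - 1][1]
        when, length = revisions[i]
        # Skip revisions that keep more than drop_ratio of the prior text.
        if prev_len == 0 or length > drop_ratio * prev_len:
            continue
        repair_delay = None
        for later_when, later_len in revisions[i + 1:]:
            # Treat the page as repaired once it regains most of its size.
            if later_len >= recover_ratio * prev_len:
                repair_delay = later_when - when
                break
        events.append((i, repair_delay))
    return events

# Made-up revision history for illustration only.
t0 = datetime(2003, 1, 1)
history = [
    (t0, 1000),
    (t0 + timedelta(days=1), 1020),
    (t0 + timedelta(days=2), 30),              # most of the page deleted
    (t0 + timedelta(days=2, minutes=3), 1020), # restored three minutes later
    (t0 + timedelta(days=5), 1100),
]
events = find_mass_deletions(history)
```

A length-based check like this cannot tell vandalism from a legitimate rewrite, so any counts it produces would still need manual inspection.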
Authorship

When you visit and contribute to a wiki site, you may do so either as a registered user or as an anonymous user. history flow shows that some wiki pages are almost exclusively crafted by registered authors whereas other pages are mainly created by anonymous visitors to the site.

Figure: Wikipedia page on "evolution"; each color represents the contribution of a particular registered author. White and gray represent the contributions of anonymous authors.

Figure: Wikipedia page on "Microsoft"; we see many more anonymous contributions (white and gray) than in the "evolution" page above. It's interesting to compare this entry with the entry for IBM!
Growth

Visualizations in history flow have indicated that some pages in Wikipedia tend to grow gradually over time whereas other pages grow in bursts. (And most seem to grow continually rather than stabilizing at a fixed size.)

Figure: Wikipedia page on "Brazil"; unlike the page on "Microsoft", which shows gradual growth over time, this page suddenly grew by more than 100% of its size.
Persistence

Visualizations in history flow indicate that the text on some Wikipedia pages persists for a long time. Some of the text contributed by certain authors survives many revisions.

Figure: Wikipedia page on "Islam"; the highlighted portions of the visualization (light yellow) represent text contributed by a single author and its persistence over time.
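One simple way to put a number on persistence is to count how many consecutive versions a piece of text survives after it first appears. The sketch below does this with exact line matching, which is a deliberate simplification and not necessarily how history flow tracks text; the function name `persistence` is hypothetical.

```python
def persistence(versions):
    """Count consecutive versions in which each line survives.

    versions: list of versions, oldest first; each version is a list of
              lines (strings).
    Returns a dict mapping each line to the number of consecutive versions,
    starting from its first appearance, that contain it. A line that is
    removed and later reinserted is only counted for its first run.
    """
    counts = {}
    ended = set()
    for lines in versions:
        present = set(lines)
        for line in present:
            if line not in ended:
                counts[line] = counts.get(line, 0) + 1
        # Any known line missing from this version has ended its run.
        for line in counts:
            if line not in present:
                ended.add(line)
    return counts

# Toy example: "a" survives all three versions, "b" is removed after one.
toy = [["a", "b"], ["a", "c"], ["a", "b", "c"]]
result = persistence(toy)
```

Exact matching treats a line with a single changed character as new text, so this measure will understate persistence on pages with many small copy edits.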
Future directions

The results on this page suggest several interesting lines of research. Understanding the frequency and timing of vandalism has clear importance for maintainers of wiki sites. And analyzing the overall stability--in size and content--is important in assessing the reliability of group-authored web sites.

It would be good to have a solid understanding of the relationship between the various factors highlighted here. For instance, how does anonymity affect the likelihood of vandalism? Are page sections that survive many edits more likely to be high quality? Although the visualizations above are suggestive, any potential patterns need to be verified through statistical analysis.

We also plan to compare Wikipedia with other wiki-style sites to learn how different design decisions affect authoring patterns. Finally, it would be fascinating to use this technique to study the evolution of source code.
(c) Copyright 2003 IBM.