Thesis/Introduction

From Researchwiki

The Internet has seen astonishing growth in blogging, RSS, and podcasting as forms of user-generated content. Blogs are replacing traditional news sites, and online discussion and interaction, through the popularity of sites such as digg and Slashdot, are changing the way we find, judge and trust information. Wikis have continued this trend towards user-built interactive information universes. Wikipedia, a free, open-content (publicly user-created) encyclopaedia, has popularised the concept of a wiki, with many projects adopting MediaWiki (shown in figure 1.1), the software used by Wikimedia for Wikipedia, or creating their own custom wiki systems.

Figure 1.1 "A screenshot of the editing page and preview of the default Main Page of a MediaWiki 1.7.1 installation."

Examples of important wikis are:

  • Wikipedia and related projects (Wiktionary, Wikibooks, Wikicommons, Wikisource, Wikinews etc.)
  • c2.com (Ward's Wiki) - the first wiki, hosting the Portland Pattern Repository and material on Extreme Programming
  • Georgia Institute of Technology (CoWeb) - used by students in classes at Georgia Tech
  • New York Times Digital - used by project teams within the company
  • Motorola Systems-on-Chip Design Technology (TWiki) - used for project management, communication, documentation, article writing, and group scheduling
  • Encarta Encyclopaedia - which introduced a wiki-like extension to its encyclopaedia

The Internet community, as an intellectual group, promotes the distribution of information made as correct as each contributor's ability allows, but care must be taken, as that level of ability may be low. Google search and PageRank help indicate the best pages, improving the apparent average quality, as writers link to sources they trust. When reading blogs, the standard is set by the best blogs, because nobody reads the average blog.

People just produce whatever they want; the good stuff spreads, and the bad gets ignored. And in both cases, feedback from the audience improves the best work.

Open source and wikis break the traditional model of publishing: rather than authors publishing in their own spaces and competing for an audience, authors contribute to the same space, attempting to improve the collective writing of the community. To paraphrase: people contribute what they like, the good stuff stays, the bad gets removed.

Herein lies the major criticism of wikis in general: that quality cannot "evolve" from this process. There is no guarantee of the accuracy of content, and there is no formal process of validation by which content is said to be correct. Rather, a continual process is used, where content is constantly being validated and edited, and accuracy is transitory.

This research seeks to understand these criticisms, and discern how to improve the content that wiki and open-content communities have worked to create, making it more authoritative and widely usable in the academic world.

Definition of Selected Terms

Apache a popular open source web (HTTP) server.

blog a publicly accessible journal published online using specialised software, where entries are presented in reverse chronological order, usually written by a single author, or a small group. Most blogging software supports RSS, which allows readers to subscribe to a blog, and automatically receive updates.

CamelCaps a method of joining words together by capitalising each word before removing spaces between words. Commonly used by programmers, and in some wiki systems.
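
As a minimal illustrative sketch of the definition above (the function name to_camel_caps is hypothetical and is not taken from any wiki engine), joining words in this way could look like:

```python
def to_camel_caps(phrase: str) -> str:
    """Join the words of a phrase, capitalising the first letter of each."""
    # Split on whitespace, upper-case the first letter of every word,
    # then join the words with no spaces between them.
    # The remaining letters of each word are left unchanged.
    return "".join(word[:1].upper() + word[1:] for word in phrase.split())


print(to_camel_caps("recent changes"))  # RecentChanges
```

Wiki systems that use this convention typically treat such a joined word as a link to the page of that name.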

CGI Common Gateway Interface. A technology used by web servers to allow the server to communicate with an external application, allowing the application to respond to a user request.

click-stream a list of links or web pages a user follows while browsing the Internet

CSS Cascading Style Sheets. A document used to store formatting information for an HTML page. CSS facilitates the separation of content and formatting.

CSU Charles Sturt University.

digg a social bookmarking website presenting science and technology news. Users may vote news items up and down, varying the popularity of the item.

extreme programming an incremental software development methodology, emphasising the need for the software and developers to be adaptable, to be able to respond quickly to changes during the development lifetime.

gift culture a community where goods and services are given away in exchange for favours or respect.

GNU Recursive acronym for GNU's Not Unix. A free operating system and related tools and applications.

group-think the act of conforming to the shared opinion of a community, without a significant attempt to consider alternatives

Hacker an enthusiast. Specifically, in this document, a software developer. N.B. While this is the original meaning of the word, today it is often corrupted to mean someone who breaks security.

hook a software construct whereby a module may request to be called to handle an event
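
The idea can be sketched in a few lines. This is a generic illustration only: the names register_hook and run_hooks and the event name "page_saved" are assumptions made for the example, not the interface of MediaWiki or any other particular system.

```python
from collections import defaultdict

# Maps an event name to the list of handlers that asked to be called for it.
_hooks = defaultdict(list)


def register_hook(event, handler):
    """Called by a module to request that handler be run when event occurs."""
    _hooks[event].append(handler)


def run_hooks(event, *args):
    """Called by the core software when event occurs; runs each handler in turn."""
    for handler in _hooks[event]:
        handler(*args)


# A module registers interest in a (hypothetical) "page_saved" event...
saved_titles = []
register_hook("page_saved", saved_titles.append)

# ...and the core software later signals that the event has happened.
run_hooks("page_saved", "Main Page")
print(saved_titles)  # ['Main Page']
```

The core software needs no knowledge of the modules that extend it; it only announces events, and each registered handler decides what to do.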

HTML Hyper-Text Mark-up Language. A document format used for writing and formatting web pages.

HyperCard a powerful and flexible programming environment written by Apple Computer.

IP address Internet Protocol address. A unique number assigned to all devices (typically computers) connected to the Internet. This number facilitates the forwarding of information on the Internet to the correct destination.

ISP Internet Service Provider. An organisation that provides Internet access.

JavaScript a scripting language commonly used for performing simple tasks within a web browser.

MIME type Multipurpose Internet Mail Extensions type. A part of an Internet standard used for specifying information (typically file) formats. The MIME type specifies a content type and subtype. The major types are application, audio, image, message, model, multipart, text, and video.

MediaWiki database-backed wiki software developed closely with Wikipedia and its community. Probably the most popular and recognisable wiki engine.

MySQL a popular open source database management system.

namespace in MediaWiki, a namespace is an abstract virtual container allowing articles to be grouped such that articles from different namespaces with the same name do not conflict. In MediaWiki, namespaces are used for separating different types of content, such as help content, personal content, templates, and images. Namespaces are denoted by a phrase placed before a colon in the full identifier of an article, e.g. Help:FAQ.
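
A simplified sketch of how a full identifier divides into namespace and article name is given below. The function split_title is hypothetical and deliberately naive: it treats any text before the first colon as a namespace, whereas a real wiki engine would only recognise the namespaces it has defined.

```python
from typing import Tuple


def split_title(full_title: str) -> Tuple[str, str]:
    """Split a full article identifier into (namespace, name)."""
    # partition() splits on the first colon only, so any later colons
    # remain part of the article name.
    namespace, colon, name = full_title.partition(":")
    if not colon:
        # No colon: the article has no namespace prefix.
        return "", full_title
    return namespace, name


print(split_title("Help:FAQ"))  # ('Help', 'FAQ')
print(split_title("FAQ"))       # ('', 'FAQ')
```

Because the namespace is part of the identifier, Help:FAQ and FAQ are distinct articles and do not conflict.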

NPOV Neutral Point of View. A Wikipedia policy stating that 'all articles must be written from a neutral point of view, that is, they must represent all significant views fairly and without bias' (Wikipedia Contributors 2006k).

open-content works not produced for profit and released for distribution and improvement by others at no cost. Such works are often written collaboratively.

open source open-content source code publishing, where the source materials used in generating the end product are also released. Most commonly refers to open source software, where the source code is released along with the finished product.

PageRank A method for determining a numerical approximation of the reputation of a web page, used by Google search for ranking search results.

Perl the specification for a high-level interpreted programming language sharing features with C and AWK, well suited to processing text files.

perl a software implementation of the Perl specification.

PHP PHP: Hypertext Preprocessor. A popular open-source programming language commonly used for writing web applications.

PIM Personal Information Manager. Software that combines features such as notes or todos, calendars or communications (email/instant messaging/telephone/fax), as an organisation aid to the user.

podcast a collection of files (typically audio or video) distributed on the Internet using the enclosures feature of RSS to "push" the files out to subscribers. Podcatcher or aggregator software allows users to subscribe to RSS "feeds" which signal the software to download new files as they become available.

RCS Revision Control System. Software used to manage multiple versions of files (such as documentation or program source code). Such a system typically allows a user to review or revert to previous versions, as well as track changes and related meta-data (contributing user, date etc.)

RSS Really Simple Syndication (most common meaning). A specially formatted file published on the Internet, containing a series of entries. These textual entries usually contain a summary of available content, such as blog items, news items, or podcast items. End user software is used to automatically collect up-to-date versions of these files, and present the contained summaries to the user. An "enclosures" feature allows the inclusion of a file (typically audio or video) with each entry.

seeding creating the initial set of pages in a wiki, providing an initial structure and guidelines for users.

Slashdot A popular technology news site, with a large and active community. The Slash software used on the site contains a moderation system used to rate and filter the often hundreds of comments posted in reply to each news item.

social bookmarking web based collaborative repository of Internet bookmarks (URLs or links). Such repositories typically support some sort of rating or commenting mechanism to help visitors find and manage bookmarks.

Special Page one of a set of dynamic pages in MediaWiki, facilitating functions such as deleting pages, searching, moving pages, logging in and out, and various administration functions.

spider or crawler. Automated software, typically used by search engines, that finds and downloads web pages, using links in pages downloaded to find new pages to download.

SQL Structured Query Language. A programming language designed to provide an interface to database management systems.

user sub-pages Sub-pages are a feature of the MediaWiki software where pages may be created logically "beneath" another page. For example, a page titled Animals may have a sub-page called Animals/Dogs. A user sub-page is a sub-page beneath a user's personal page in the "User" namespace.

wiki 1. a website allowing collaborative authoring, where users may add, edit and remove text (or possibly other media) in a single central repository of "pages". 2. software that facilitates such functions.

Wikimedia A not-for-profit organisation co-founded by Jimmy Wales. Wikimedia maintains several web sites including Wikipedia, Wikinews and Wikibooks.

Wikipedia A free open-content multi-lingual encyclopedia run by the Wikimedia Foundation.

WYSIWYG What You See Is What You Get. A phrase used to describe the ideal in document editing, that the content will appear on the printed page (or other final format) as it does on screen (during editing).

Outline of Chapters

This research is presented in five chapters. Chapter two reviews literature in the areas of wikis, and online trust and reputation. It serves to introduce the field to the reader, to explore what is known in these areas, and to identify voids that research has yet to explore. These voids suggest topics for the research detailed in later chapters.

Chapter three details the nature of the research being performed. The first half of chapter three outlines assumptions, limitations, the research questions being studied, and the methods for achieving the goals of the research. The second half explains and expands the technical aspects of the tools used in the research, as well as the reasons for their selection.

Chapter four presents the data, analysis and results of the research, discusses how the data is interpreted, and explains the relevance and importance of the results.

Chapter five summarises the process taken in this research, and provides a broad discussion and conclusions based on the research. It summarises the results of chapter four, and provides interpretations and limitations of these findings. Finally it suggests avenues for further research.
