This essay contains the advice or opinions of one or more Wikipedia contributors. Essays may represent widespread norms or minority viewpoints. Consider these views with discretion. Essays are not Wikipedia policies.
Although Wikipedia currently has fewer than 4 million articles (plus millions of red-link articles), the total number of revisions exceeds 535 million. The current article count is 3,955,966, with 535,095,848 total revisions, giving an average of 135.26 revisions per article. The total grows by more than 1 million revisions per week.
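The average quoted above is a plain division of total revisions by article count. A minimal sketch using the figures from this essay (the function name is illustrative, not part of any Wikipedia tool):

```python
def average_revisions(total_revisions: int, article_count: int) -> float:
    """Return the mean number of stored revisions per article."""
    if article_count <= 0:
        raise ValueError("article_count must be positive")
    return total_revisions / article_count


# Figures as quoted in this essay.
avg = average_revisions(535_095_848, 3_955_966)
print(f"{avg:.2f} revisions per article")
```

Rerunning the same calculation with later counts would show whether the average is rising or falling.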
Pruning, or erasing (deleting), minor revisions might require modifying Wikipedia policies or enhancing the underlying MediaWiki software. However, revisions could also be pruned in advance, by several actions each user could take to avoid long-term storage of more revisions.
You may wonder if it even matters that Wikipedia has such a high number of revisions. But there are some issues.
First, there is a cost to the numerous revisions. The Wikimedia Foundation, which operates Wikipedia, is a non-profit organization. Its continued operation relies solely on private donations. Without these donations, Wikipedia could not exist.
There is a cost to each edit, each revision stored, and each page view, even without editing. Though the cost of each one is minute, with millions each day (including about 140,000 edits), these all add up. This is no reason to be discouraged from reading or editing Wikipedia: that is what the site is for, and funds have been allocated for this purpose, provided they are used responsibly. Simetrical, a MediaWiki developer, estimated in 2007 that storing "2,000 [trivial new edits] a day would amount to a whole $3.65 per year"; storage costs have since decreased further.
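To put the 2007 estimate in per-edit terms, here is a rough sketch. It assumes, purely for illustration, that every one of the roughly 140,000 daily edits costs the same to store as a trivial one. That assumption is not made in the source, so the extrapolated yearly figure is only an order-of-magnitude guide.

```python
DAYS_PER_YEAR = 365


def cost_per_edit(yearly_cost_usd: float, edits_per_day: int) -> float:
    """Storage cost of one edit, given a yearly cost for a daily edit volume."""
    return yearly_cost_usd / (edits_per_day * DAYS_PER_YEAR)


# Simetrical's 2007 estimate: 2,000 trivial edits a day cost $3.65 per year.
per_edit = cost_per_edit(3.65, 2_000)

# Assumption (not from the source): all 140,000 daily edits cost the same.
yearly_all_edits = per_edit * 140_000 * DAYS_PER_YEAR

print(f"${per_edit:.6f} per trivial edit")
print(f"${yearly_all_edits:.2f} per year at 140,000 edits a day")
```

Under that assumption the storage cost of a single edit is a tiny fraction of a cent, which is why the essay's argument rests more on clutter than on money.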
The quality of writing is also at stake. The goal of Wikipedia is to have a well-written encyclopedia. This cannot be accomplished in one day, or in a single edit for each article. But an edit history that is clogged with experimental or "junk" edits may become confusing, and versions that fail to meet encyclopedic standards may in the long run have a negative effect, particularly if they are not marked with tags to indicate their temporary status.
An editor who makes multiple edits to an article while working toward an intended final version may view any edits before the final one as temporary revisions that will not remain very long. Sometimes it is not easy, or even possible, to make the intended final revision in a single edit. This can be the case when the edit contains a huge amount of information, or when it is difficult to enter all the text at once.
It is well recognized that Wikipedia is a work in progress and there is no deadline for completion. But it is quite possible that the temporary versions will be viewed by one or more people in the interim.
Articles vary as to how often they are viewed. Pageview stats can provide this information. If an article is viewed around 6000 times a day, that amounts to approximately 4 views every minute. If it takes one minute between these multiple edits, that means about 4 people viewed what was not intended to be the permanent version. This is not a problem when the article is simply incomplete, but may be an issue when it contains seriously out-of-date or otherwise inaccurate information, especially about a living person or otherwise sensitive subject.
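The estimate above can be sketched as a small calculation. It assumes that views are spread evenly across the day, which real traffic is not, so the result is only a rough average. The function name is illustrative.

```python
MINUTES_PER_DAY = 24 * 60


def views_during_gap(daily_views: float, gap_minutes: float) -> float:
    """Expected views of an interim revision that stays live for gap_minutes.

    Assumes page views are spread evenly over the day.
    """
    return daily_views / MINUTES_PER_DAY * gap_minutes


# An article viewed about 6000 times a day, with one minute between saves.
print(f"{views_during_gap(6000, 1):.1f} views of the interim version")
```

Longer gaps between saves scale the figure linearly: ten minutes between edits on the same article would expose the interim text to roughly ten times as many readers.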
Some of the actions that could be performed, by individual users, to prune both the existing and future revisions would be:
Each of the above actions is elaborated below, to explain the particular techniques for reducing revisions.
Many new users simply do not realize they can combine their edits as one SAVE, using a longer edit-summary line. Users often just change a phrase and save, change a phrase and save, and so on. Some users don't even realize how the extra revisions pile up, expanding as a long list under the History tab of past revisions. Several simple steps could be taught:
I've noticed several new users, while greatly expanding a low-traffic article, will keep saving every new phrase as though other users might pounce, at any minute, on the revised article before the next desperate SAVE is made. Many users don't understand how to watch the History listing to see if an article is quiet now, unchanged for weeks, and can be safely edited for hours with no one else making changes.
Some kinds of vandalism go unnoticed for months, sometimes over 6 months, so there is no hurry to revert all hackings made in low-traffic articles. There are several issues to consider:
The tendency to rapidly revert vandalism, as though the whole world has stopped breathing, has made it difficult for other users to also enhance those articles while correcting hacked text. It is an utter myth that "Every article gets vandalized" (not true): many articles go years without ever being hacked. Depending on notability or libel concerns, many hacked articles could wait (a long time) to be fixed as part of a broader improvement.
It would be a lot easier to warn new users, early on, that saving after every change, as a new revision, does not make those changes safer or somehow more permanent. Other users can simply revert all those revisions, reversing days or weeks of edits. The only permanent change is bloating Wikipedia ("forever") with numerous edits, all of which get reverted. Some issues to consider:
Not every group of other editors is cooperative; sometimes, cliques of other users can act like inner-wiki ("inner-city") gangs that live by their code, while outsiders receive pre-calculated treatment. There might be no safe way to save revisions, or get talk-page consensus, under those circumstances. Either move on to other articles, or else contact an admin or WikiProject that might help the balance of power.
Plan some enhancements to articles by using offline files of either notes or potential new text. It might help to explain to users that an offline version of an article allows more time for completing broad revisions; any changes other users have made in the meantime can then simply be merged into the new text. Some issues to note:
For large updates, planning enhancements offline could reduce total revisions by a factor of 25 (or more), due to the focus on broad wording, while avoiding hacks to half-finished text by the tinkering of other users.
Beyond just planning enhancements, it can be easier to create entire articles offline. Some issues to note:
Note that offline storage does not have the Wikipedia backup protection, so be sure to make periodic copies, if needed, for backup.
Perhaps the easiest way to actually erase old revisions is to create a new article within the user-space for a particular user, then copy (not move) the article to become a brand-new entry in article-space. When the text is copied as a new article, then all those user-space revisions could be deleted. Some issues to note:
The article revisions will not be removed from Wikipedia until the user-space article has been deleted. After 2007, it became possible to restore a deleted article, so in the event of a major misunderstanding, any deleted article can be restored by an admin.
An effective way to actually erase many old revisions is to rename an old article by creating the new name as a new article, then copying (not moving) the text so that it becomes a brand-new entry in article-space. Once the text has been copied as a new article, all those old revisions (under the old name) could later be deleted. Some issues to note:
Again, some people treasure the old stuff, like keeping decades of old, worn-out shoes, so not all articles can be pruned so easily by simply copy-renaming them into a clean, fresh name. The best compromise would be to keep an old history list in the talk-page or as a subpage "/old_history". However, in a fight to retain the detailed differences, it might be necessary to recreate the major old revisions by a series of repeated edits, filling the wiki-edit buffer with each successive major revision, and then saving each to allow comparisons of texts. Those recreated revisions should copy the edit-summary line from the actual old revisions, but also include text identifying each original user+date. Even though that re-creation, of major revisions, might seem extreme, many hundreds/thousands of minor revisions would be omitted, representing years of edits with just a few dozen recreated, major (non-trivial) revisions. Such a re-created article could retain the major detailed differences, showing how the article evolved, but omit the thousands of minor interim revisions that clutter a step-by-step viewing of each next revision.
Some of the Wikipedia bots, which perform repeated robotic edits to many articles, are focused on really tiny updates to articles, with some even correcting the grammar in vandalism jokes or within hacked text that will soon be removed. For example, a bot that capitalizes the word "english" to be "English" assumes that people never use the word in any other manner, such as "putting english" (a spin) on a billiard ball. It is highly debatable whether bots should be allowed to run rampant and make opinionated conclusions about lowercase words (such as "english"). Of course, thousands of revisions could be generated by renaming "metre" to "meter" or some similar minor changes. Bots should be barred from making those minor changes.
The fixer-bots should be run on a rare basis, and perhaps even count how many corrections would be made to an article, then cleverly refuse to update an article just for a single minor word, unless the bot was running in a quarterly update-all-minor-issues mode. Let the bots analyze small problems, or even count the occurrences of lowercase "english" and such, but those bots should wait to fix minor issues, limiting severe precision to a few times a year, and then fix all problems with one revision to each article, at that later time.
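The batching rule described above could be sketched as follows. This is a hypothetical decision function, not the logic of any existing Wikipedia bot; the name `should_apply_fixes`, the `quarterly_mode` flag, and the default threshold of two fixes are all assumptions made for illustration.

```python
def should_apply_fixes(pending_fixes: int,
                       quarterly_mode: bool,
                       min_fixes: int = 2) -> bool:
    """Decide whether a fixer-bot should save a new revision of an article.

    pending_fixes:  number of minor corrections the bot has counted.
    quarterly_mode: True during the rare "update all minor issues" run.
    min_fixes:      hypothetical threshold below which the bot waits.
    """
    if pending_fixes <= 0:
        return False      # nothing to fix, never save an empty revision
    if quarterly_mode:
        return True       # the rare full run fixes everything in one revision
    return pending_fixes >= min_fixes  # otherwise skip single-word edits


# A single lowercase "english" is left alone outside the quarterly run.
print(should_apply_fixes(1, quarterly_mode=False))  # False
print(should_apply_fixes(1, quarterly_mode=True))   # True
print(should_apply_fixes(5, quarterly_mode=False))  # True
```

Either way, all counted corrections would go into one saved revision per article, which is the point of the proposal.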
Although techniques such as deleting user-space articles and copy-renaming old articles might seem extreme, the combined effect of thousands of users, each redoing articles in their specialty, could reduce the revision counts for portions of Wikipedia by a factor of 10, 100 or 1000. After that point, users taught to combine numerous small edits and avoid panic saving would keep subsequent revision lists shorter and omit most of the small stuff that cluttered Wikipedia during 2006–2008.
The overall effect would be simple to measure: just compare how the average-revision count fell, compared with:
Progress could be measured by comparing those averages against a future, anticipated reduction in average revisions per article.
Many revisions of Wikipedia articles are very minor and could be "pruned" from the overall History of article revisions, to leave only the more major revisions. Even if the minor revisions were actually left in storage, perhaps they could be bypassed when listing all other article revisions. However, actually erasing specific revisions from an article might require changing, or reconfiguring, the software behind Wikipedia, the MediaWiki system.
Perhaps 50% of all revisions are hackings/jokes + revert: it's not just the hacking of articles that escalates the total revisions, but the instant reverting that doubles the total revisions to recover from hacking.
If the Wikipedia storage system were altered, over half of all revisions could be erased from Wikipedia servers and no longer listed under the "History" tab of logged revisions:
Again, erasing or combining old revisions might require changing, or reconfiguring, the MediaWiki software behind Wikipedia.
Long lists of revisions showing hackings/jokes, each followed by instant-reverts, can cause the History listing of revisions to become much longer. Beyond a survey of prankster edits, there's not much use to long-term logging of a hacking+revert. It's analogous to bird droppings: it might be beneficial to notice some short-term patterns, such as when cars parked under some trees get bombarded with bird droppings, but it is less useful to record all the millions of bird droppings, everywhere in the world, in a giant database of history listings. Just forget about those bird droppings, rather than log each for eternity. Seek to have a "clean car", and focus on issues such as dents or tree limbs falling on a car.
The process of erasing a revision should require some special authority to avoid edit-wars that would use erase to negate other contributions. The 3-revert rule (WP:3RR) is intended to limit edit-wars during a one-day period. However, unrestricted use of erase could prolong edit-wars where revisions were removed without reverting.
Also, an erase could complicate the wikiserver databases, unless an erase was treated as a "delete" transaction in the storage. The database transaction log could handle the erase as a delete of that particular revision. Internally, the article history could record a hacking as a "Try" followed by an "Erase" so that the list of revisions would not show either transaction under the "History" tab when listing all the true revisions. However, long term, it might be more efficient if the hacking+erase really disappeared from the article storage at some point. Those combined "Try+Erase" transactions could be removed from the system after some period of time. Obviously, each transaction must be recorded, short-term, to allow backup/recovery of the edited articles, in the event of storage failure mid-way before the erase took effect.
If hackings cannot be erased, then perhaps they could be skipped in History listings. A special type of revert could indicate that the prior revision was merely hacked text, with the result that both the hacked+reverted entries would be skipped under a History listing. It can be very tedious, when stepping through multiple revisions, to see "Page blanked with" and then "Reverted" to show the entire text of the article changed in both revisions. Skipping over the hacked+reverted revisions would condense the History listing of many high-traffic articles by about 50% or more. However, numerous low-traffic articles have gone years without being hacked.
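The skipping idea can be sketched as a filter over a history list. This is not how MediaWiki stores or lists revisions; the `Revision` class and its `reverts_hack` flag are hypothetical stand-ins for the "special type of revert" proposed above.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Revision:
    rev_id: int
    summary: str
    reverts_hack: bool = False  # hypothetical flag set by the special revert


def condensed_history(revisions: List[Revision]) -> List[Revision]:
    """Return the history (oldest first) without hacked+reverted pairs.

    Each flagged revert hides itself and the single revision just before it.
    """
    visible: List[Revision] = []
    for rev in revisions:
        if rev.reverts_hack:
            if visible:
                visible.pop()   # drop the hacked revision being reverted
            continue            # and do not list the revert itself
        visible.append(rev)
    return visible


history = [
    Revision(1, "Expand introduction"),
    Revision(2, "Page blanked"),
    Revision(3, "Reverted", reverts_hack=True),
    Revision(4, "Fix citation"),
]
print([rev.rev_id for rev in condensed_history(history)])  # [1, 4]
```

The sketch only handles one hacked revision per revert; a run of several hackings undone by a single revert would need the revert to record how many prior revisions it covers.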