Welcome to the Offline-Wikipedia project!

News

05/20: Packages for Ubuntu Hardy Heron and Gutsy Gibbon available
We are proud to announce that the beta version of Offline-Wikipedia is now available for Ubuntu, and soon for other distributions.
We are looking for people who can package the application for Fedora, Mandriva, openSUSE and other distributions. For more information, please email us at offlinewikipediaproject_[At-No_SpAm]gmail_[dot SpAm] com .

Project presentation

This project aims to provide an efficient way to use an offline version of Wikipedia. This is very useful for people who cannot reach Wikipedia online. All page titles are indexed using the Xapian indexer. For instance, with the English Wikipedia (based on the March dump), about 2 million page titles are indexed, and a search takes less than 0.1 sec.

Features

Offline-Wikipedia allows you to:

Some Benchmarks

Hardware: Intel Core 2 Duo E6550 @ 2.66 GHz, 2 GB of RAM
Article name         Time to search (1)   Time to extract (2)   Time to render HTML (3)
United States        0.1194 sec           0.199 sec             10.5686 sec
Earth                0.0366 sec           0.0657 sec            4.5826 sec
Moon                 0.0345 sec           0.0666 sec            3.2055 sec
France               0.03 sec             0.066 sec             1.5835 sec
Bird                 0.0311 sec           0.0638 sec            6.7544 sec
NATO                 0.0486 sec           0.1305 sec            3.286 sec
Big Bang             0.0041 sec           0.0727 sec            2.045 sec
General relativity   0.0064 sec           0.1308 sec            3.727 sec
Philosophy           0.0301 sec           0.0645 sec            1.4277 sec
Evolution            0.0237 sec           0.0586 sec            7.2471 sec
These preliminary results are based on a non-optimized (beta) version; we hope to improve the rendering process later.

Notes:
(1) Time to search the indexed page titles (using the Xapian engine).
(2) Time to extract the page content.
(3) Time to transform the wiki markup into HTML.
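
To compare the three stages, the figures of one table row can simply be added up. The sketch below is not part of the project: stage_share is a hypothetical helper that takes the search, extract and render times of one article and prints the total time and the percentage of it spent rendering HTML.

```shell
# Hypothetical helper (not part of owi-server).
# Arguments: search time, extract time, render time, all in seconds.
# Prints the total time and the share of it spent in rendering (percent).
stage_share() {
    awk -v s="$1" -v e="$2" -v r="$3" 'BEGIN {
        total = s + e + r
        printf "%.4f %.1f\n", total, 100 * r / total
    }'
}

# "United States" row from the table above:
stage_share 0.1194 0.199 10.5686    # prints: 10.8870 97.1
```

For this row, rendering accounts for about 97% of the total time, which is why the rendering process is the stage we hope to improve.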

Requirements

The Offline-Wikipedia server and its dependencies require at least 50 MB of disk space. The Wikipedia content (articles and indexes) requires 5.6 GB for the English dump (March version).
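
Before installing, it may help to check that enough disk space is free. This is only a sketch: has_free_mb is a hypothetical helper, the 5800 MB figure is a rounded sum of the two numbers above (5.6 GB plus 50 MB), and "$HOME" merely stands in for whichever directory will hold the content.

```shell
# Hypothetical helper (not part of owi-server).
# Succeeds when the filesystem holding directory $1 has at least $2 MB free.
has_free_mb() {
    # df -P prints one POSIX-format line per filesystem;
    # -k reports sizes in 1024-byte blocks, column 4 is the free space.
    avail_kb=$(df -Pk "$1" | awk 'NR == 2 { print $4 }')
    [ "$avail_kb" -ge $(( $2 * 1024 )) ]
}

# Example (the target directory is an assumption):
# has_free_mb "$HOME" 5800 && echo "enough space" || echo "not enough space"
```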

It works on legacy hardware and low-cost laptops (such as the Asus Eee PC), but for good performance we recommend a computer that is no more than two years old.

Installation

Installation takes two steps: first download and install the Offline-Wikipedia server, then install the Offline-Wikipedia content. Unfortunately, we cannot provide the content yet, because we are still looking for an online file storage service to host the modified Wikipedia dumps. For now, if you want to install Offline-Wikipedia, you have to install the owi-server package and then build the content yourself.

Server Installation

Dependencies

This project depends on several packages; the install commands below list them for each supported Ubuntu release.

Ubuntu Hardy Heron 8.04

Install the project dependencies by typing in a terminal:

sudo apt-get install php5-cli libxapian15 python-django imagemagick texlive-latex-base texlive-base-bin

Then download the Offline-Wikipedia server package from the Download Page and install it with:

 sudo dpkg -i owi-server_0.1b1-ubuntu8.04-1_i386.deb

Ubuntu Gutsy Gibbon 7.10

On Ubuntu Gutsy Gibbon, you have to install the latest version of Xapian. First import the Xapian archive key:

sudo wget -O- http://www.xapian.org/debian/archive_key.asc | sudo apt-key add -

Enter your user password when prompted, then refresh the package lists:

sudo apt-get update

The following command must report version 1.0.5, not 1.0.2:

apt-cache show libxapian-dev
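
The version check can also be scripted. The sketch below is not part of the project: version_ge is a hypothetical helper that compares dotted version numbers field by field, and it assumes the reported version starts with the upstream number (a packaging suffix such as "-1" is ignored by the numeric comparison).

```shell
# Hypothetical helper (not part of owi-server).
# Succeeds when dotted version $1 is greater than or equal to dotted version $2.
version_ge() {
    awk -v a="$1" -v b="$2" 'BEGIN {
        na = split(a, x, "."); nb = split(b, y, ".")
        n = (na > nb) ? na : nb
        for (i = 1; i <= n; i++) {
            # Missing fields count as 0; "5-1" + 0 evaluates to 5 in awk.
            xi = (i <= na) ? x[i] + 0 : 0
            yi = (i <= nb) ? y[i] + 0 : 0
            if (xi > yi) exit 0
            if (xi < yi) exit 1
        }
        exit 0
    }'
}

# Example use with the command above:
# installed=$(apt-cache show libxapian-dev | awk '/^Version:/ { print $2; exit }')
# version_ge "$installed" 1.0.5 && echo "Xapian is recent enough"
```
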
Then install the project dependencies by typing in a terminal:

sudo apt-get install php5-cli libxapian15 python-django imagemagick texlive-latex-base texlive-base-bin

Then download the Offline-Wikipedia server package from the Download Page and install it with:

 sudo dpkg -i owi-server_0.1b1-ubuntu7.10-1_i386.deb

To get the database, you have two options: build the indexed database yourself from the official dump, or download the indexed database (not hosted yet, see Installation above).

Content Installation

Build the content yourself

1) Get the source code (you need Subversion):
svn co https://owi.svn.sourceforge.net/svnroot/owi/trunk owi-server
2) Install the Xapian development package and compile the Xapian index program:
sudo apt-get install libxapian-dev
Then compile the Xapian program:
cd owi-server/owi-index_search
make xapian
3) Go into the install directory and edit the Makefile. VERSION corresponds to a dump release date (available dates are listed at http://download.wikipedia.org/backup-index.html) and LANG to the Wikipedia language:

VERSION = 20080312
LANG = en

Launch the whole process:

make content
Warning: on a slow system, this can take a very long time!
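
If you rebuild the content for several dumps, the Makefile edit in step 3 can be scripted. This is a sketch only: set_make_var is a hypothetical helper, it assumes the variables are written as "NAME = value" at the start of a line as shown above, and it relies on the -i (in-place) option of GNU sed.

```shell
# Hypothetical helper (not part of owi-server).
# Rewrites the line "NAME = ..." in a Makefile.
# Arguments: Makefile path, variable name, new value.
set_make_var() {
    sed -i "s|^$2[[:space:]]*=.*|$2 = $3|" "$1"
}

# Example, run from the install directory:
# set_make_var Makefile VERSION 20080312
# set_make_var Makefile LANG en
```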

Team

Role                                User Information
Project Administrator & Developer   Jerome CHARLOT 'bsheep'

Contact

If you want additional information, send me an email at offlinewikipediaproject_[At-No_SpAm]gmail_[dot SpAm] com.

Developer information

How to get the source

Install Subversion, then type:

svn co https://owi.svn.sourceforge.net/svnroot/owi/trunk owi-server

Or browse the source code at http://owi.svn.sourceforge.net/viewvc/owi/ .

Credits

I'd like to acknowledge Thanassis Tsiodras for his great ideas for this project, and the Fslab team for modifying the MediaWiki parser for offline use.