Welcome to the offline-wikipedia project!
05/20: Packages for Ubuntu Hardy Heron and Gutsy Gibbon are available
We are proud to announce that the Beta version of Offline-wikipedia is now available for Ubuntu, and will soon be available for other distributions.
We are looking for people who can package applications for Fedora, Mandriva, openSUSE and others. For more information, please email us at offlinewikipediaproject_[At-No_SpAm]gmail_[dot SpAm] com.
This project aims to provide an efficient way to get an offline version of Wikipedia, which is useful for people who do not have access to the online Wikipedia. All page titles are indexed with the Xapian indexer. For instance, with the English Wikipedia (based on the March dump), about 2 million page titles are indexed, and a search typically takes less than 0.1 seconds.
Offline-wikipedia allows you to:
Benchmark machine: Intel Core 2 Duo E6550 @ 2.66 GHz, 2 GB of RAM
Article Name | Time to search (1) | Time to extract (2) | Time to render HTML (3) |
United States | 0.1194 sec | 0.199 sec | 10.5686 sec |
Earth | 0.0366 sec | 0.0657 sec | 4.5826 sec |
Moon | 0.0345 sec | 0.0666 sec | 3.2055 sec |
France | 0.03 sec | 0.066 sec | 1.5835 sec |
Bird | 0.0311 sec | 0.0638 sec | 6.7544 sec |
NATO | 0.0486 sec | 0.1305 sec | 3.286 sec |
Big Bang | 0.0041 sec | 0.0727 sec | 2.045 sec |
General relativity | 0.0064 sec | 0.1308 sec | 3.727 sec |
Philosophy | 0.0301 sec | 0.0645 sec | 1.4277 sec |
Evolution | 0.0237 sec | 0.0586 sec | 7.2471 sec |
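To make the table above easier to read at a glance, the per-stage averages can be computed with a few lines of shell. This is only a sketch for summarising the published figures: the helper names `benchmark_rows` and `stage_means` are made up for this example and are not part of offline-wikipedia.

```shell
# Rows copied from the table above: article|search|extract|render (seconds).
benchmark_rows() {
cat <<'EOF'
United States|0.1194|0.199|10.5686
Earth|0.0366|0.0657|4.5826
Moon|0.0345|0.0666|3.2055
France|0.03|0.066|1.5835
Bird|0.0311|0.0638|6.7544
NATO|0.0486|0.1305|3.286
Big Bang|0.0041|0.0727|2.045
General relativity|0.0064|0.1308|3.727
Philosophy|0.0301|0.0645|1.4277
Evolution|0.0237|0.0586|7.2471
EOF
}

# Print the mean of each timing column, rounded to three decimals.
# LC_ALL=C keeps "." as the decimal separator regardless of locale.
stage_means() {
    LC_ALL=C awk -F'|' '
        { search += $2; extract += $3; render += $4; n++ }
        END {
            if (n) printf "search=%.3f extract=%.3f render=%.3f\n",
                          search / n, extract / n, render / n
        }
    '
}

benchmark_rows | stage_means
```

On these ten articles, the averages show that searching and extracting together take well under a second, while rendering the HTML accounts for almost all of the time.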
The Offline-wikipedia server and its dependencies require at least 50 MB of disk space. The Wikipedia content, articles and indexes require 5.6 GB for the English dump (March version).
It works on legacy hardware and low-cost laptops (such as the Asus Eee PC), but for good performance we recommend a computer that is no more than two years old.
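Before installing, it can be worth checking that the target disk has room for the figures given above. The following is a minimal sketch: the helper names `free_mb` and `has_room` and the `TARGET_DIR` variable are invented for this example, and 5.6 GB is taken here as roughly 5735 MB, which is an assumption about how the figure was measured.

```shell
# Free space, in MB, on the filesystem that holds directory $1.
# `df -Pk` prints POSIX-format output in 1024-byte blocks; column 4 is "Available".
free_mb() {
    df -Pk "$1" | awk 'NR == 2 { print int($4 / 1024) }'
}

# Succeeds when directory $1 has at least $2 MB free.
has_room() {
    [ "$(free_mb "$1")" -ge "$2" ]
}

# 50 MB for the server and dependencies, plus about 5.6 GB
# (assumed to be ~5735 MB) for the English content and indexes.
REQUIRED_MB=$((50 + 5735))

# TARGET_DIR is a hypothetical variable; it defaults to the current directory.
if has_room "${TARGET_DIR:-.}" "$REQUIRED_MB"; then
    echo "enough free space"
else
    echo "need at least ${REQUIRED_MB} MB free"
fi
```

Building the content from a dump probably needs extra working space on top of the final 5.6 GB, so treat the result as a lower bound.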
Installation is divided into 2 steps. First you download and install the Offline-Wikipedia Server, and then you install the Offline-Wikipedia Content. Unfortunately, we cannot provide the content yet, because we are still looking for an online file storage service to host the modified Wikipedia dumps. For now, if you really want to install offline-wikipedia, you have to install the owi-server package and then build the content yourself.
This project depends on the following packages:
sudo apt-get install php5-cli libxapian15 python-django imagemagick texlive-latex-base texlive-base-bin
Download the offline-wikipedia server package from the Download Page, then run:
sudo dpkg -i owi-server_0.1b1-ubuntu8.04-1_i386.deb
On Ubuntu Gutsy Gibbon, you first have to install the latest version of Xapian:
wget -O- http://www.xapian.org/debian/archive_key.asc | sudo apt-key add -
Enter your user password when prompted.
sudo apt-get update
The following command has to report version 1.0.5, not 1.0.2:
apt-cache show libxapian-dev
Then install the project dependencies by typing in a terminal:
sudo apt-get install php5-cli libxapian15 python-django imagemagick texlive-latex-base texlive-base-bin
Then download the offline-wikipedia server package from the Download Page and run:
sudo dpkg -i owi-server_0.1b1-ubuntu7.10-1_i386.deb
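The Xapian version check in the Gutsy Gibbon instructions above can be automated instead of read by eye. This is a sketch under a few assumptions: the requirement is read as "1.0.5 or later", the helper names (`version_ge`, `candidate_version`, `check_xapian`) are made up for this example, and `sort -V` is a GNU coreutils extension that may not exist on every system.

```shell
# Succeeds when version $1 is greater than or equal to version $2.
# Uses `sort -V` (version sort): if $2 sorts first, then $1 >= $2.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Version of package $1 as reported by apt-cache (first entry only).
# Prints nothing when apt-cache is missing or the package is unknown.
candidate_version() {
    apt-cache show "$1" 2>/dev/null | awk '/^Version:/ { print $2; exit }'
}

# Report whether the available libxapian-dev is recent enough.
check_xapian() {
    found=$(candidate_version libxapian-dev)
    if [ -n "$found" ] && version_ge "$found" 1.0.5; then
        echo "libxapian-dev $found is recent enough"
    else
        echo "libxapian-dev ${found:-not found}: 1.0.5 or later is needed" >&2
        return 1
    fi
}
```

Run `check_xapian` after `sudo apt-get update`. For full Debian version strings (epochs, revision suffixes), `dpkg --compare-versions` is the more precise comparison tool; `sort -V` is used here only to keep the sketch free of distribution-specific commands.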
There are 2 ways to get the database: build the indexed database yourself from the official dump, or download the indexed database.
1) Check out the source code:
svn co https://owi.svn.sourceforge.net/svnroot/owi/trunk owi-server
2) Install and compile the Xapian index engine:
sudo apt-get install libxapian-dev
Compile the xapian program:
cd owi-server/owi-index_search
make xapian
3) Go into the install directory and edit the Makefile (the version is listed at http://download.wikipedia.org/backup-index.html; it corresponds to a dump release date):
VERSION = 20080312
LANG = en
Launch the whole process:
make content
Warning: on a slow system, this can take a very long time!
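The Makefile edit described above can also be scripted, which helps when rebuilding for several dumps or languages. This is a minimal sketch, assuming the Makefile defines `VERSION` and `LANG` each on its own line as shown; `set_dump_params` is a hypothetical helper name, not something shipped with the project.

```shell
# Rewrite the VERSION and LANG lines of a Makefile in place.
#   $1 = path to the Makefile
#   $2 = dump date as YYYYMMDD (for example 20080312)
#   $3 = wiki language code (for example en)
set_dump_params() {
    makefile=$1
    version=$2
    lang=$3

    # Reject anything that is not an 8-digit date or a plain language code.
    case $version in
        [0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]) ;;
        *) echo "bad dump version: $version" >&2; return 1 ;;
    esac
    case $lang in
        ''|*[!a-z-]*) echo "bad language code: $lang" >&2; return 1 ;;
    esac

    tmp=$(mktemp) || return 1
    if sed -e "s/^VERSION *=.*/VERSION = $version/" \
           -e "s/^LANG *=.*/LANG = $lang/" "$makefile" > "$tmp"; then
        # Copy the contents back so the Makefile keeps its permissions.
        cat "$tmp" > "$makefile"
        rm -f "$tmp"
    else
        rm -f "$tmp"
        return 1
    fi
}
```

For example, `set_dump_params Makefile 20080312 en` followed by `make content`. GNU make also accepts variables on the command line (`make content VERSION=20080312 LANG=en`), which normally overrides plain assignments in a Makefile without editing it; whether this project's Makefile works correctly that way has not been verified here.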
Role | User Information |
Project Administrator & Developer | Jerome CHARLOT 'bsheep' |
If you want additional information, send me an email at offlinewikipediaproject_[At-No_SpAm]gmail_[dot SpAm] com.
To get the source code, install Subversion and type:
svn co https://owi.svn.sourceforge.net/svnroot/owi/trunk owi-server
Or browse the source code at http://owi.svn.sourceforge.net/viewvc/owi/.
I'd like to thank Thanassis Tsiodras for his great ideas for this project, and the Fslab team for modifying the MediaWiki parser for offline use.