Polverone
Now celebrating 21 years of madness
Posts: 3186
Registered: 19-5-2002
Location: The Sunny Pacific Northwest
Member Is Offline
Mood: Waiting for spring
|
|
Download an open forum backup!
Sciencemadness.org is now offering for the first time what I hope will become a standard fixture in online chemistry communities: an open, freely
available offline archive of messages and files from this discussion board. This is only a first release, and it has a few glitches and rough edges to
work out. Here's what it already has to offer:
-An archive of static HTML copies of forum index pages from all sections but Whimsy and Detritus (Whimsy may be added at a later date)
-An archive of static HTML copies of all threads from all sections but Whimsy and Detritus (Whimsy may be added at a later date)
-Modifications to make the static HTML indices refer to the static HTML threads, and to make all static HTML pages use local copies of graphics files
and attachments
-An optional media archive containing copies of all attachments and inline images from threads
Here are some of the rough edges that I hope to address in the future:
-User-added links from one thread to another still refer to the online site, not the offline archive; there are also other opportunities to rewrite
links for local use
-There's considerable page clutter that I can and should remove before the next release; there's no use for Post New Topic, Today's
Posts, etc. in an offline archive
-A disturbingly large fraction of attachments seemed to download with errors; I'm not sure if this is a problem with my archiving software, the
board, or the original uploads
-All threads appear as single HTML files; this is taxing to browsers/computers on large threads
The base archive, containing board icons/graphics and threads, can be found here: http://www.sciencemadness.org/archive/sm_main.zip, 21,840,110 bytes.
The media archive, containing inline images and attachments, can be found here: http://www.sciencemadness.org/archive/sm_media.zip, 92,212,812 bytes.
The media archive is still uploading from my home machine, so I would suggest waiting a couple of hours before attempting to download it. After you
have downloaded the main archive or both archives, unzip them and point your web browser at index.html to begin enjoying your offline copy of the
forum. Depending on how heavily people download these files, I may make them available all the time, or for only a limited time window near the end of
each month.
I will continue to upload encrypted database dumps from time to time, since those are easier to use for board recovery, but this is the archive you
want to download if you've ever feared losing something from the forum, or if you want to refer to it even when you're not on the internet,
or if you'd like a local copy to search/analyze/whatever.
Please let me know of any glitches you encounter or enhancements you'd like to see in this thread. Enjoy!
PGP Key and corresponding e-mail address
|
|
chemoleo
Biochemicus Energeticus
Posts: 3005
Registered: 23-7-2003
Location: England Germany
Member Is Offline
Mood: crystalline
|
|
Thank you very much Polverone! I see all this heavy bandwidth is being put to *good* use!
I tested it out a little.
I guess the major problem I could see is that the links are relative to *your* computer system. I.e. when I load up index.html, open a
subforum, and click on a thread, it produces e.g. this link: file:///extra/sciencegrab/chemistry_in_general/0000023.html
This is of course not the directory I installed this into, so clicking this link produces nothing. But it shouldn't be a problem to fix, as the links
between index.html and the different subdirectories work fine.
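A rough sketch of how such a link could be made relative, so it no longer depends on where the archive was built. This is not the archiver's actual code: the function name is made up for illustration, and the build directory /extra/sciencegrab is only inferred from the link quoted above.

```python
import posixpath
from urllib.parse import urlsplit, unquote


def relative_link(link, page_path, archive_root):
    """Turn an absolute file:// link into one relative to the page holding it.

    link         -- href found in the page, e.g. file:///extra/sciencegrab/...
    page_path    -- path of the page containing the link, relative to the archive root
    archive_root -- absolute directory the archive was built in (assumed known)
    """
    parts = urlsplit(link)
    if parts.scheme != "file":
        return link  # leave http:// links and already-relative links alone
    target = unquote(parts.path)
    if not target.startswith(archive_root.rstrip("/") + "/"):
        return link  # points outside the archive, so it cannot be made relative
    target_in_archive = posixpath.relpath(target, archive_root)
    page_dir = posixpath.dirname(page_path) or "."
    relative = posixpath.relpath(target_in_archive, page_dir)
    # Keep any #anchor so links to a specific post still land in the right place.
    return relative + ("#" + parts.fragment if parts.fragment else "")
```

Called for a link found in chemistry_in_general/index.html, this would give just the thread's file name; called for the same link found in the top-level index.html, it would give chemistry_in_general/ followed by the file name.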
Very nice work otherwise, btw. I wonder how you combined several paged threads into a single html.
Never Stop to Begin, and Never Begin to Stop...
Tolerance is good. But not with the intolerant! (Wilhelm Busch)
|
|
Polverone
Now celebrating 21 years of madness
Posts: 3186
Registered: 19-5-2002
Location: The Sunny Pacific Northwest
Member Is Offline
Mood: Waiting for spring
|
|
Ugh, you're right. I obviously made a typo in preparing that section. I have uploaded a zip file with just the fixed index information:
http://www.sciencemadness.org/archive/fixedindices.zip
I am also uploading a fixed version of the sm_main archive. Edit: the fixed version is now in place. Anyone who downloads sm_main.zip now
should get the correct index information.
It was very easy to make the threads into one long piece: I created a new user account named archiver, went to my control panel, and had it show 1000
posts per page (longer than any existing thread). Then I just had my script log in as archiver and grab each of the one-page threads.
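For anyone curious, the approach described above might look roughly like this in Python. It is only a sketch under assumptions, not the actual archiving script: the function names are invented, and the login address and form fields are left for the caller to supply because the board's real login form is not described here. Only the viewthread.php?tid=N address form is taken from links on the board itself.

```python
import http.cookiejar
import urllib.parse
import urllib.request

BASE = "http://www.sciencemadness.org/talk/"  # board address as it appears in thread links


def thread_url(tid):
    """Address of one thread, using the viewthread.php?tid=N form."""
    return BASE + "viewthread.php?" + urllib.parse.urlencode({"tid": tid})


def make_session():
    """Opener that keeps cookies, so a login carries over to later requests."""
    jar = http.cookiejar.CookieJar()
    return urllib.request.build_opener(urllib.request.HTTPCookieProcessor(jar))


def log_in(session, login_url, form_fields):
    """POST the login form as the archiver account.

    login_url and form_fields are hypothetical inputs supplied by the caller;
    the real form layout is not given in this thread.
    """
    data = urllib.parse.urlencode(form_fields).encode("ascii")
    with session.open(login_url, data) as response:
        response.read()


def grab_thread(session, tid, out_path):
    """Save one thread as a single HTML file.

    With the account set to show 1000 posts per page, one request is
    expected to return the whole thread.
    """
    with session.open(thread_url(tid)) as response:
        html = response.read()
    with open(out_path, "wb") as handle:
        handle.write(html)
```

The cookie-keeping opener is what lets the posts-per-page setting of the logged-in account apply to every later page request.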
I must thank everyone who contributed financially to sciencemadness. This sort of project would not have been possible on the old site, due to the
much more limited bandwidth and disk space.
[Edited on 4-30-2005 by Polverone]
|
|
Ramiel
Vicious like a ferret
Posts: 484
Registered: 19-8-2002
Location: Room at the Back, Australia
Member Is Offline
Mood: Semi-demented
|
|
Both backups downloaded.
Caveat Orator
|
|
The_Davster
A pnictogen
Posts: 2861
Registered: 18-11-2003
Member Is Offline
Mood: .
|
|
I downloaded both of them, but when I attempt to extract the media zip file I get several errors, and the folder into which I extracted it is empty.
Anyone else have this problem?
|
|
Rosco Bodine
Banned
Posts: 6370
Registered: 29-9-2004
Member Is Offline
Mood: analytical
|
|
Backups are a great idea!
In these times of troubling disappearances of websites, data, and discussions, particularly of obscure or not well known information of the kind
which makes such knowledge "SENSITIVE IN NATURE" ... under such circumstances, the free distribution of such information, spread far and wide,
is an effective countermeasure against the censors and "thought police" who are doing their tyrannical best to keep people ignorant subjects
whose knowledge is limited to what they are deemed "authorized" to know.
Any small victory against those Orwellian, Machiavellian fascists is a worthy accomplishment.
And that is the larger matter which should govern us all in these times. Seeing what has happened with the hive, and the direction things seem
to be going for E&W also, the priority should be to preserve hard-gotten data assembled in ways found nowhere else on earth, and to guarantee
that information's continued availability, as much so as if it were Winchesters being passed out to the pioneers as they circle the wagons and
see what the savages are going to do to interfere with progress.
|
|
Polverone
Now celebrating 21 years of madness
Posts: 3186
Registered: 19-5-2002
Location: The Sunny Pacific Northwest
Member Is Offline
Mood: Waiting for spring
|
|
Hi rogue chemist, I just tested downloading the media file and unzipping it. I had no problems. Are you sure the file you downloaded is exactly
92,212,812 bytes in size? Its md5 checksum is a109745ae876bdff3a9273b1b01ed225.
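If you'd rather not compare these by eye, a short Python check along these lines should do it. The size and checksum are the ones quoted above for sm_media.zip; the function names and the local file name are just placeholders.

```python
import hashlib
import os

EXPECTED_SIZE = 92212812  # bytes, as quoted above for sm_media.zip
EXPECTED_MD5 = "a109745ae876bdff3a9273b1b01ed225"


def md5_of_file(path, chunk_size=1 << 20):
    """MD5 of a file, read in chunks so a large archive never sits in memory."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def download_is_intact(path, expected_size=EXPECTED_SIZE, expected_md5=EXPECTED_MD5):
    """True only if both the byte count and the MD5 checksum match."""
    if os.path.getsize(path) != expected_size:
        return False  # cheap check first: a truncated download fails here
    return md5_of_file(path) == expected_md5
```

For example, download_is_intact("sm_media.zip") would return False for a download that was cut short, without having to try unzipping it first.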
[Edited on 4-30-2005 by Polverone]
|
|
The_Davster
A pnictogen
Posts: 2861
Registered: 18-11-2003
Member Is Offline
Mood: .
|
|
I tried re-downloading it. The first time, my wireless connection died on me, which could have caused some corruption. In any case it works fine now.
I still get a few errors during unzipping, but the files work, so all is good.
Now to save this archive to disk.
|
|
Rosco Bodine
Banned
Posts: 6370
Registered: 29-9-2004
Member Is Offline
Mood: analytical
|
|
Downloaded the backup quick and easy, and got no errors on decompressing the zip files.
Everything appears to work perfectly; navigation and page loading are instantaneous. Never seen the forum work so fast.
Nothing like a data drive for a local file server, and it would probably be quick even on a CD.
Oh, just a reminder to anybody having any problems: it can be a firewall glitch on your local machine, with the firewall being spoofed by
Explorer activity and blocking the unrecognized activity as suspect "traffic". If you have any trouble, check your firewall allow settings
or turn off filtering.
|
|
chemoleo
Biochemicus Energeticus
Posts: 3005
Registered: 23-7-2003
Location: England Germany
Member Is Offline
Mood: crystalline
|
|
Works great here, too, including attachments and pictures!
Even pictures stored elsewhere were grabbed, which is great because once those sites go down this data isn't irretrievably lost!
One minor issue: I noticed that, once browsing in actual threads, trying to go back by clicking Sciencemadness Discussion Board
(file:///.../sciencemadness%20html/sciencegrab/energetic_materials/index.php), or Organic chemistry (i.e. the forum,
file:///.../sciencemadness%20html/sciencegrab/organic_chemistry/forumdisplay.php?fid=10), or whatever doesn't work. The file is not found, so
the rewriting of board links is seemingly not applied to all internal links.
Of course, this can be worked around by using the browser's back button.
Maybe there's an easy fix for this. It's not essential, though, so all is good.
|
|
Polverone
Now celebrating 21 years of madness
Posts: 3186
Registered: 19-5-2002
Location: The Sunny Pacific Northwest
Member Is Offline
Mood: Waiting for spring
|
|
It's correct that I made no effort to fix those additional links. I will add that fix in a future release.
|
|
Rosco Bodine
Banned
Posts: 6370
Registered: 29-9-2004
Member Is Offline
Mood: analytical
|
|
One feature I would like to see enabled is the "printable version" view, since that makes it much easier to capture and export any text.
Saves a lot of ink when you want to print something, too.
[Edited on 2-5-2005 by Rosco Bodine]
|
|
MadHatter
International Hazard
Posts: 1339
Registered: 9-7-2004
Location: Maine
Member Is Offline
Mood: Enjoying retirement
|
|
Backups
Both backups downloaded. Thanks, Polverone !
From opening of NCIS New Orleans - It goes a BOOM ! BOOM ! BOOM ! MUHAHAHAHAHAHAHA !
|
|
Polverone
Now celebrating 21 years of madness
Posts: 3186
Registered: 19-5-2002
Location: The Sunny Pacific Northwest
Member Is Offline
Mood: Waiting for spring
|
|
New backups are now ready for download under the same names as before, sm_main.zip and sm_media.zip. There is unfortunately no easy way to offer
incremental updates at the present time. Links have been improved in this version. The visual appearance has been cleaned up too. Finally, the archive
now includes printable versions of the threads.
One oddity that you may notice with this archive or the last is that the index pages can be slightly more up to date than the actual threads. For
example, a thread may be listed as having three replies but you only see one when you click on the thread. This is a bit of a wart but not actually a
bug; it's due to the way the archiver caches threads but always downloads fresh index pages.
|
|
Axt
National Hazard
Posts: 795
Registered: 28-1-2003
Member Is Offline
Mood: No Mood
|
|
One of the things I think would improve the search capabilities is using the topic title as the "page title".
For example, if one searches through Windows or Google for a word, the page is always named "Sciencemadness Discussion Board - Powered by XMB 1.8
Partagium Final S..". On other forums the topic title becomes part of the "page title", so it's easy to identify threads.
As another example, look at the top title bar of the page here (http://www.sciencemadness.org/talk/viewthread.php?tid=3295), compared to here
(http://www.xsorbit2.com/users/apcforum/index.cgi?board=general&action=display&num=1095230413).
Since the search function doesn't work in the archive, this makes it hard to use "3rd party" search engines on the archive, as they only pull
up the page title.
Other than that .... great
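To make the suggestion concrete, something like the following could be run over each archived thread page. It is only a sketch: it assumes the archiver already knows each thread's topic, the function name is made up, and a regex swap is just one simple way to do it.

```python
import html
import re

# First <title>...</title> element in the page, whatever its case or line breaks.
TITLE_RE = re.compile(r"<title>.*?</title>", re.IGNORECASE | re.DOTALL)


def set_page_title(page, topic, board_name="Sciencemadness Discussion Board"):
    """Put the thread topic at the front of the page title.

    topic is assumed to be known to the archiver already, for example
    from the forum index page that listed the thread.
    """
    new_title = "<title>" + html.escape(topic) + " - " + board_name + "</title>"
    # A function as the replacement avoids backslash escapes in the topic
    # being interpreted by re.sub.
    return TITLE_RE.sub(lambda match: new_title, page, count=1)
```

With the topic first, a desktop or web search that only shows the page title would list the thread name instead of the same board banner for every file.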
|
|
Polverone
Now celebrating 21 years of madness
Posts: 3186
Registered: 19-5-2002
Location: The Sunny Pacific Northwest
Member Is Offline
Mood: Waiting for spring
|
|
New backups are now ready for download under the same names as before, sm_main.zip and sm_media.zip. There is unfortunately no easy way to offer
incremental updates at the present time.
|
|
Polverone
Now celebrating 21 years of madness
Posts: 3186
Registered: 19-5-2002
Location: The Sunny Pacific Northwest
Member Is Offline
Mood: Waiting for spring
|
|
New backups are now ready for download under the same names as before, sm_main.zip and sm_media.zip. There is unfortunately no easy way to offer
incremental updates at the present time.
Later I'm going to try to take the forum down for a bit while I upgrade the software, so you might want to take the opportunity to download an
offline copy now.
[Edited on 12-4-2005 by Polverone]
|
|
MadHatter
International Hazard
Posts: 1339
Registered: 9-7-2004
Location: Maine
Member Is Offline
Mood: Enjoying retirement
|
|
Backups
Both now in the UPLOAD folder on my FTP.
|
|
Nerro
National Hazard
Posts: 596
Registered: 29-9-2004
Location: Netherlands
Member Is Offline
Mood: Whatever...
|
|
This is a quick reply
#261501 +(11351)- [X]
the "bishop" came to our church today
he was a fucken impostor
never once moved diagonally
courtesy of bash
|
|
wa gwan
Harmless
Posts: 37
Registered: 15-4-2005
Member Is Offline
Mood: No Mood
|
|
Is there another backup coming soon?
|
|
Rosco Bodine
Banned
Posts: 6370
Registered: 29-9-2004
Member Is Offline
Mood: analytical
|
|
Yesterday I noticed some error message script superimposed on the image of the main page, and a few other glitches, which were a transient
problem. Remembering some connectivity problems not too long ago, the two things made me a bit nervous and caused me to wonder how up to
date the present backup is.
There's been a lot of interesting information and discussion added since the last known backup, which really should be protected, archived
data, secured by an up-to-date backup.
So please, at the earliest opportunity, let's get an updated backup. It's cheap insurance and peace of mind.
|
|
solo
International Hazard
Posts: 3975
Registered: 9-12-2002
Location: Estados Unidos de La Republica Mexicana
Member Is Offline
Mood: ....getting old and drowning in a sea of knowledge
|
|
I have a question: do all the articles that have been uploaded become part of the backup? Also, is there a file folder where all of the uploaded
articles reside, and can it be accessed by members? The reason for asking is that there are an awful lot of citations being uploaded and no
way to see an index of what's available. At WD I keep a folder for all the references ever requested and fulfilled, with their upload links
available for future researchers, also to avoid reinventing the wheel. solo
It's better to die on your feet, than live on your knees....Emiliano Zapata.
|
|
Polverone
Now celebrating 21 years of madness
Posts: 3186
Registered: 19-5-2002
Location: The Sunny Pacific Northwest
Member Is Offline
Mood: Waiting for spring
|
|
I do like having the cheap insurance of a backup, but doing the sort of transformations that are necessary to make the offline archive presentable and
navigable is a bit painful. In case someone with relevant programming experience is reading: I am using the Python module BeautifulSoup to locate
elements in each page (e.g. the "New Topic" button) and then I do string replacements to delete or alter elements (operating on an entire page as one
large string). The problem is that the strings returned by BeautifulSoup may not be exactly the same sequence of characters that appeared in the web
page -- whitespace may be changed. Each one of these discrepancies must have a special case in the code, which is ugly and time-consuming to develop.
I did it once, but several months later the forum software was upgraded and the work needed to be done again. I still haven't re-done this work.
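One possible way around the whitespace problem, sketched here with only the standard re module: build a pattern from the fragment in which every run of whitespace matches any run of whitespace in the page. This is a swapped-in idea, not what the archiver currently does, and the function names are made up. It only covers differences in whitespace that is present in both strings, not whitespace that appears in one and is missing entirely in the other.

```python
import re


def whitespace_tolerant_pattern(fragment):
    """Regex matching the fragment with any whitespace run where it has one."""
    tokens = fragment.split()  # split on runs of spaces, tabs, and newlines
    return re.compile(r"\s+".join(re.escape(token) for token in tokens))


def remove_fragment(page, fragment):
    """Delete every occurrence of the fragment from the page.

    page is the raw HTML as one large string; fragment is the text of an
    element as reported by the parser, possibly with altered whitespace.
    """
    if not fragment.split():
        return page  # nothing but whitespace: leave the page untouched
    return whitespace_tolerant_pattern(fragment).sub("", page)
```

The idea is that a single helper like this would stand in for the per-discrepancy special cases, at the cost of being slightly more permissive about what it matches.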
Solo: yes, attachments are downloaded and stored by the code. You would still need an indexing system to go with them, because the file names may be
something uninformative like "068374_methanol.pdf", where the numerical prefix is the number of the post that the attachment was found in.
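As a starting point for such an index, the numeric prefix can at least be used to group attachments by the post they came from. A minimal sketch, assuming every attachment follows the postnumber_name pattern described above; the function name is made up.

```python
import re
from collections import defaultdict

# Leading digits, an underscore, then the original file name.
ATTACHMENT_RE = re.compile(r"^(\d+)_(.+)$")


def index_attachments(file_names):
    """Map post number -> list of attachment names found in that post."""
    index = defaultdict(list)
    for name in file_names:
        match = ATTACHMENT_RE.match(name)
        if match is None:
            continue  # not an attachment in the postnumber_name form
        post_number = int(match.group(1))  # "068374" becomes 68374
        index[post_number].append(match.group(2))
    return dict(index)
```

Fed the output of os.listdir() on the media folder, this gives a lookup from post number to files, which could then be joined with thread titles to make a readable listing.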
|
|
Waffles
Hazard to Others
Posts: 196
Registered: 1-10-2006
Member Is Offline
Mood: No Mood
|
|
Quote: | Originally posted by prica
X > B.D.
I cane Send vaglias,but if ya came doon this parts, you're my guest(x2 p. 2weeks).Augh ! |
WHY DO THESE PEOPLE THINK THAT WE UNDERSTAND THEM
THIS IS NOT LANGUAGE
"…'tis man's perdition to be safe, when for the truth he ought to die."
|
|
gambler
Harmless
Posts: 44
Registered: 30-6-2006
Member Is Offline
Mood: No Mood
|
|
Are there any plans in the works to prepare a current open forum backup?
Thank you in advance
|
|