Tuesday, November 6, 2007

Urgh... no time...

I'm running out of time to do stuff. If you haven't noticed from my complete profile, I like starting lots of projects but often don't have the time to get all of them done. This blog may become one of "those" projects, unfortunately... I just don't have the time to edit all these old stories. I wish I did, though, because the OCR scans of these old papers sure have ruined the stories' text, which is very sad, very sad indeed. I am hoping that when the Federal Government does take this thing out of beta, they resolve that and actually hire someone as an editor to fix the problem(s) and give the text files a real cleanup.

Tuesday, October 23, 2007

Banquet is Planned

Source:http://chroniclingamerica.loc.gov/ndnp-repository/ndnp:1263067/raw/ndnp:1263066/N69122pdfFile.pdf



Banquet is Planned


The Irish nationalists of this city will hold a reunion and picnic under the auspices of the Knights of Tara at Sebuetren park Sunday. The committee on arrangements is as follows: Dan Foster, chairman; Tom Alford, manager; John Morlsey, secretary; Con Dempsey, Pat Ward, Jack Barry, Mike Doogan, Pat Caasidy, Johnny Kane, Tom Shaughnessy and Joe Sullivan.


-------------


I'm not going to post the original OCR this time, but will just say that if you search for the exact phrase "Tom Shaughnessy" in Chronicling America, you probably won't find this article since the OCR butchered his name... but if you do the same search in this blog's search, you'll find it. This is exactly the reason this blog is being created. Using OCR versions of these pdfs just isn't good enough for really good searches, since the OCR doesn't catch everything and translate it properly...

JAPANESE MINISTER OF HOUSEHOLD IS DEAD

Source:http://chroniclingamerica.loc.gov/ndnp-repository/ndnp:1263067/raw/ndnp:1263066/N69122pdfFile.pdf

JAPANESE MINISTER OF HOUSEHOLD IS DEAD
Prince Tomosada Iwakura Succumbs in Tokyo

TOKYO. April 1.—Prince Tomosada Iwakura, minister of the imperial household, died today. He was formerly vice grand chamberlain, privy councilor and director of the peerage.

He was born in 1851 and was the eldest son of the late Prince Iwakura, a leading imperialist in the struggle that led to restoration.


------------------


FYI, below is how the original OCR version of the story above looks. After this post, I'll probably leave the OCR text versions out of my posts since they're redundant. I'm just putting it here so you can see how badly the OCR garbled everything... which explains why I'm posting here...

Who was Tomosada J.vakura ???... nuff said.


------------------



JAPANESE MINISTER OF
HOUSEHOLD IS DEAD


Prince Tomosada Iwakura Succumbs
in Tokyo

TOKYO. April 1.—Prince Tomosada
J.vakura, minister of the imperial
louw.hold, died today. He was formerly
vice grand chamberlain, privy
councilor and director of the peerage.
7-f«> was born In 1851 and ww the eldest
•on of the late Prince Iwakura, a
Joadins imperialist in the, struggle that
'*6 to restoration.
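
For anyone curious why a phrase search misses a name like this, here is a rough Python sketch (my own illustration, not how Chronicling America's search actually works). It compares an exact phrase search against a fuzzy one built on the standard library's difflib. The function names and the 0.75 cutoff are just assumptions I picked for the example.

```python
import difflib

# A piece of the garbled OCR text from the story above.
ocr_text = ("Prince Tomosada J.vakura, minister of the imperial "
            "louw.hold, died today.")


def exact_phrase_found(phrase, text):
    """Plain substring search, ignoring case."""
    return phrase.lower() in text.lower()


def fuzzy_phrase_found(phrase, text, cutoff=0.75):
    """Slide a window the same number of words as the phrase across
    the text and accept any window that is similar enough.
    The cutoff value is an arbitrary assumption for this sketch."""
    words = text.split()
    size = len(phrase.split())
    for start in range(len(words) - size + 1):
        window = " ".join(words[start:start + size])
        score = difflib.SequenceMatcher(
            None, phrase.lower(), window.lower()).ratio()
        if score >= cutoff:
            return True
    return False


print(exact_phrase_found("Tomosada Iwakura", ocr_text))
print(fuzzy_phrase_found("Tomosada Iwakura", ocr_text))
```

The exact search comes up empty because "Iwakura" never appears in the OCR text, while the fuzzy search still catches "J.vakura" as a near match. Fuzzy matching is only a workaround, though. It doesn't fix the text itself.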

300,000 COAL MINERS STRIKE FOR MORE PAY

Source:http://chroniclingamerica.loc.gov/ndnp-repository/ndnp:1263067/raw/ndnp:1263066/N69122pdfFile.pdf

300,000 COAL MINERS STRIKE FOR MORE PAY

Union Officials Say Walkout Is
Merely Suspension of Work
Pending Adjustment

New Wage Scale Asks Increase
and Conferences Fail to
Bring Agreement

President Lewis Believes That
the Miners Will Gain
Their Demands

Indianapolis. March 31.—Three hundred thousand organized miners of the bituminous coal fields of Pennsylvania, Ohio, Indiana, Illinois, Iowa, Missouri, Kansas, Oklahoma and Arkansas quit work at midnight pending settlement of a new wage scale.

Officers of the united mine workers of North America declared the walkout was not a strike, but merely a suspension of work because no wage scale had been made to replace the old scale, which expired with March. The miners demand an increase of pay, in some instances of 5 cents a ton, and in other instances more, with certain changes in working conditions.

Confidence was expressed by the operators that there would be no general coal famine, large supplies of fuel having been stored in anticipation of the walkout.

While the miners predict the suspension will be cut short by a prompt signing of wage scales, some of the operators maintain the mines may be kept closed for a month, or longer.

The first settlement came in an announcement from Brazil. Ind., the center of the Indiana block coal field, where the demand for a 5 cents increase was granted.


---------------


This first post of an article is here to show you the reason(s) I'm posting this blog. Take a look at the edited story that I just posted above after doing some manual edits. Now compare it with what's below, which is a copy and paste of the story from the original pdf file where the OCR butchered some of the words since the computer didn't recognize some of the lighter words or letters in words...


---------------



300,000 COAL
MINERS STRIKE
FOR MORE PAY

Union Officials Say Walkout Is
Merely Suspension of Work
Pending Adjustment

New Wage Scale Asks Increase
and Conferences Fail to
Bring Agreement


President Lewis Believes That
the Miners Will Gain
Their Demands


rNT'IANAiQIAS. March Sl.—Three
hundred thousand organized miners of
the Mtumlnous coal fields of Pennsylvania.
Ohio, Indiana, Illinois, lowa. Mlspouri,
Karusas, Oklahoma and Arkansas
quit nork at midnight pending settlement
of a new wage prale.
Officers of the united mine workers
of North America declared the tvalkout
was not a strike, but merely a suspen
«ion of work because no wage scale
had been made to replace the old scale,
which expired with March. The miners
demand an increase; of pay, in some in-
Btanccs of .*> cents, a ton. and in other
hiytances more, with certain changes in
workingconditions.
confidence wa? expressed by the
operators that ther^ would be no general
coal famine, large .supplies of furl
!iavinjc b*en stored in anticipation of
th*» walkout.
While the miners predict the suspension
will be cut short by a prompt
signing of wage scales, some of the
"[wraliirs maintain the mines may be
kept cloyed for a month, or longer.
The iirs=t settlement came in an announcement
from Brazil. Ind., the center.'
of the Indiana block coal field,
where, the df-niand for a 5 cents in-
'-rease was granted.

Hi there. This blog is about old news stories, mainly public domain stuff.

**PLEASE, PLEASE, PLEASE DO NOT CONSIDER THIS A SPLOG OR BLOG WORTH DELETION! PLEASE READ BELOW FOR MORE INFO ABOUT WHY I'M CREATING THIS BLOG, AND IT MAY MAKE MORE SENSE TO YOU AS TO WHY IT'S HERE!**

I am starting this blog as an attempt to post old news stories, mostly public domain ones that are in ancient newspapers and things like that. The primary reason I'm doing this is because this blog's search capabilities are, in my opinion, better than the search capabilities of the online homes of many of these old stories, and also because I plan to create mirrors of some of these stories on another website, and plan to use this blog to post links to the urls on that site.

I'll post links to the main files online where I'm finding the articles in each posting, and possibly a link to a mirror of each as well. I'm posting mirrors of this stuff because there's no telling how long the original links will stay up, since some of them are to government websites that are considered beta at the moment, so the urls may move around at some point in the future...

A few weeks back, I came across the National Digital Newspaper Program and its Chronicling America: Historic American Newspapers Beta Project.

This is a very awesome project. They are trying to create digital images in pdf and text format of public domain newspapers. This sort of stuff is very useful for many people like me that like to read about history and also for people doing college level studies on the past and news from the past.

One thing that I noticed when browsing around in the Chronicling America: Historic American Newspapers Beta Project is that while each page of the papers that were scanned is accessible as pdf and text files, it appears that the text files are really just straight OCR translations of the scanned pdf files. What this means is that a lot of the stories, when read as text files, don't make a lot of sense or are garbled gibberish. Sometimes the OCR didn't accurately translate the stories. More often than not it did accurately translate what it could, but read the page from left to right instead of in columns of text like the newspapers were meant to be read. As a result, you can do searches in the Chronicling America Search Pages, but all that you can reliably search for is individual words, since phrase searches may or may not work due to all the OCR mistranslations and moving around of text.
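
Here is a small Python sketch of the column problem, using a few lines borrowed from two of the stories on this blog. The two-column layout and the function names are made up for illustration, and I'm only guessing at how the real OCR walks the page, but it shows how reading straight across breaks a phrase that is perfectly intact in the paper.

```python
# Two side-by-side newspaper columns, each a list of printed lines.
# The layout is hypothetical; the wording comes from stories on this blog.
left_column = [
    "The miners demand an",
    "increase of pay, in some",
    "instances of 5 cents a ton",
]
right_column = [
    "Prince Tomosada Iwakura,",
    "minister of the imperial",
    "household, died today.",
]


def read_across(left, right):
    """Read line 1 of both columns, then line 2 of both, and so on.
    This mixes the two stories together."""
    return " ".join(f"{l} {r}" for l, r in zip(left, right))


def read_by_column(left, right):
    """Read the whole left column, then the whole right column,
    the way the paper was meant to be read."""
    return " ".join(left + right)


phrase = "imperial household"
print(phrase in read_across(left_column, right_column))
print(phrase in read_by_column(left_column, right_column))
```

Read across, the words "imperial" and "household" end up separated by a line from the other story, so the phrase search fails. Read by column, the phrase is found. Every single word is still searchable either way, which matches what I see on the site.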

What I'm going to try to do with this blog is create a copy of each news story in the Chronicling America Project, but actually correct the OCR's text translations, and sort of edit it to make it readable in English with the full phrases... then I'll post a link to the original pdf source where I got the story, and also the mirror of the pdf if and when I get the mirrors uploaded to my own website (I'll be using quatoless, at least at first, as a free host for the mirrors since they basically offer near unlimited storage and bandwidth for free). This will be a time-consuming chore, but it is a worthwhile effort, if for no other reason than that it'll allow people working on scholarly research papers and things, trying to dig up more info from Chronicling America, to search this blog and get the info they want to find without having to deal with the sometimes archaic search results you'll get on the Project's main site. Eventually, I may expand this blog to other, mostly public domain, news sources as well, but for now, this is a huge project, so I'm going to try to stay focused on it. I'll mostly only be able to work on this on weekends, so if you want to do something similar with your blog, please post a reply here and link your blog here. This is a worthy, educational project to undertake. It's not a splog or anything, even though it may have some similarities in some ways. Hopefully you can see why it's a worthwhile effort... and won't report it as a blog that needs to be deleted, since it doesn't need to be. Thanks.