Get Usenet News Articles Using REXX, Network News Transfer Protocol and Sockets
Author: Dave Briccetti
Date: 1995
Source: Wayback Machine
This tip demonstrates retrieving selected news articles from an NNTP server using REXX, NNTP, and TCP Sockets.
If you are a REXX, NNTP, or Sockets expert and you see any errors or possible improvements to this tip, please share your knowledge with me.
As a software developer and contract programmer, I like to keep up with available contracts by searching Usenet newsgroups, especially ba.jobs.contract. I got tired of starting up my newsreader, opening the group, searching for "OS/2," and then weeding out those postings which say "W2ONLY." I wanted to click on an object and have it all done for me automatically. So I wrote this REXX program.
Newsreaders often get the news articles from a program known as a News Server, which runs on a host somewhere. Your Internet service provider or system administrator should give you the name of the server to use. Newsreaders communicate with the news server using the Network News Transfer Protocol, or NNTP.
To use this program, which is called GetNews, type the following from the command line (or set up program objects to supply your frequently used options):
GetNews NewsGroup SearchString [ExcludeString]
NewsGroup is the name of the newsgroup you want to search, such as ba.jobs.contract.
SearchString is the string you are searching for, such as OS/2. All articles containing your search string in the Subject: line will be considered for retrieval.
ExcludeString is an optional string, which, if found in the body of the article, will cause the article to be skipped.
Here is an example:
GetNews ba.jobs.contract OS/2 W2ONLY
This will retrieve all articles from ba.jobs.contract with subjects containing "OS/2" and where the string "W2ONLY" does not appear anywhere in the article.
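The selection rule just described can be sketched on literal strings, without any network access. WantArticle is a hypothetical helper introduced only for this illustration; GetNews itself leaves the subject match to the server and checks the body inline. Like the program, the sketch uses pos(), so the match is case-sensitive.

```rexx
/* Sketch of the GetNews selection rule, applied to literal strings */
Subject = 'US-CA-San Fran-MGR-OS/2 WARP-Recruiter'
Body    = 'SEARCH KEYS: TYPE:MGR TERM:CON W2ONLY STATE:CA AREA:415'

say WantArticle(Subject, Body, 'OS/2', 'W2ONLY')  /* 0: body has W2ONLY    */
say WantArticle(Subject, Body, 'OS/2', '')        /* 1: nothing to exclude */
exit

/* WantArticle is a hypothetical helper, not part of GetNews.
   Returns 1 when Subject contains SearchString and Body does not
   contain ExcludeString; an empty ExcludeString excludes nothing. */
WantArticle: procedure
parse arg Subject, Body, SearchString, ExcludeString
if pos(SearchString, Subject) = 0 then
    return 0
if ExcludeString \= '' & pos(ExcludeString, Body) > 0 then
    return 0
return 1
```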
Here's part of the result of running the program as in the above example, with commentary interspersed. The NNTP commands we send (GROUP, XPAT, BODY, and QUIT) each appear on a line of their own; the lines beginning with a three-digit code are the server's replies.
200 shellx.best.com InterNetNews NNRP server INN 1.4 22-Dec-93 ready (posting ok).
This is what the news server says when we connect to it. The 200 code tells us that all is well and we can continue.
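GetNews only looks at the first digit of each reply code. As a rough illustration, the sketch below maps that digit to the reply classes described in RFC 977. ReplyClass is a hypothetical helper name, and the class descriptions are paraphrased.

```rexx
/* Classify an NNTP status line by the first digit of its reply code */
Reply = '200 shellx.best.com InterNetNews NNRP server INN 1.4 22-Dec-93 ready (posting ok).'
parse var Reply Code Text
say Code '->' ReplyClass(Code)
exit

/* ReplyClass is a hypothetical helper, not part of GetNews */
ReplyClass: procedure
parse arg Code
select
    when left(Code, 1) == '1' then return 'informative message'
    when left(Code, 1) == '2' then return 'command ok'
    when left(Code, 1) == '3' then return 'command ok so far, send the rest'
    when left(Code, 1) == '4' then return 'command correct but could not be performed'
    when left(Code, 1) == '5' then return 'command unimplemented, incorrect, or serious error'
    otherwise return 'unknown'
end
```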
GROUP ba.jobs.contract
Here we select the newsgroup.
211 3614 72340 75980 ba.jobs.contract
This response (code 211) tells us that the server accepted the GROUP command, and gives the number of articles, the starting article number, and the ending article number of the group.
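The fields of the 211 reply can be picked apart with PARSE VAR, much as the program does. A minimal sketch on the literal reply shown above; the variable names are only illustrative, and the article count is an estimate according to RFC 977.

```rexx
/* Pick apart a 211 reply to GROUP using PARSE VAR */
Reply = '211 3614 72340 75980 ba.jobs.contract'
parse var Reply Code Count First Last Group .
if Code == '211' then
    say Group 'has about' Count 'articles, numbered' First 'to' Last
else
    say 'GROUP failed:' Reply
```

GetNews does not use the article range itself; it passes the open-ended range 1- to XPAT and lets the server decide which articles exist.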
XPAT SUBJECT 1- *OS/2*
We request a list of all articles whose subject contains the string "OS/2."
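The command line above is assembled from the header name, the range 1-, and the search string wrapped in asterisks. A small sketch of that string building, using the same expression as the program:

```rexx
/* Build the XPAT command the way GetNews does */
SearchField  = 'subject'
SearchString = 'OS/2'
Cmd = 'XPAT' SearchField '1- *' || SearchString || '*'
say Cmd
```

Because the program splits its command-line arguments on blanks, the search string is always a single word.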
221 subject matches follow.
72392 US-CA-San Fran-MGR-OS/2 WARP-Recruiter
72440 US-CA-San Fran-OS/2 Testing Engineer-RecruiterChen & McGinley, Inc.
72491 US-CA-San Fran-QA-OS/2, Testing-Recruiter
72563 US-CA-Oakland-LAN-LAN WAN TCP/IP OS/2 NT-Recruiter
72567 US-CA-Oakland-LAN-LAN WAN TCP/IP OS/2 NT-Recruiter
.
The search results appear, followed by a line containing only a period, which identifies the end of the data.
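A multi-line response like this one can be walked line by line once it has been received into a single string. The sketch below works on a literal copy of part of the response rather than a live socket. It also undoes the doubling of a leading period that RFC 977 prescribes for text lines, which matters mostly for BODY responses and is something GetNews itself does not do.

```rexx
/* Walk a dot-terminated NNTP response held in one string */
CRLF = '0d0a'x
Response = '221 subject matches follow.' || CRLF ||,
           '72392 US-CA-San Fran-MGR-OS/2 WARP-Recruiter' || CRLF ||,
           '72491 US-CA-San Fran-QA-OS/2, Testing-Recruiter' || CRLF ||,
           '.' || CRLF

parse var Response Status (CRLF) Rest      /* first line is the status   */
Count = 0
do while Rest \== ''
    parse var Rest Line (CRLF) Rest
    if Line == '.' then leave              /* lone period: end of data   */
    if left(Line, 2) == '..' then          /* undo a doubled first period */
        Line = substr(Line, 2)
    Count = Count + 1
    parse var Line ArticleNum Subject
    say ArticleNum '|' Subject
end
say Count 'matching articles'
```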
BODY 72392
Now we ask for the body of the first message.
222 72392 <80652395268020@dice.com> body
SEARCH KEYS: TYPE:MGR TERM:CON W2ONLY STATE:CA AREA:415
POSITION ID: ARCSF.011
DATE POSTED: 12/01/95
POSITION TITLE : PROJECT MANAGER
SKILLS REQUIREMENTS: OS/2 WARP
LOCATION : SAN MATEO
START DATE: 12/20/95
PAY RATE : NEGOTIABLE (+ benefits
LENGTH : 1 year
COMMENTS: Manage large-scale, multi-site rollout of high powered
superstation desktops. Very interesting, high profile
position.
.
And here it is, again ending with the period.
QUIT
We indicate we're finished.
205
The server accepts the command and ends the session.
The complete REXX program follows.
/* ===========================================================================
Get Usenet News Articles Matching Search Criteria, Using Network News Transfer Protocol
RFC977 (https://www.cis.ohio-state.edu/htbin/rfc/rfc977.html) and the Draft Common NNTP Extensions (ftp://ftp.internic.net/internet-drafts/draft-barber-nntp-imp-01.txt)
Written by a novice REXX programmer
Dave Briccetti, December 1995
daveb@davebsoft.com, https://www.davebsoft.com
May be used for any purpose
=========================================================================== */
parse arg NewsGroup SearchString ExcludeString
if NewsGroup = '' | SearchString = '' then
do
    say 'usage: GetNews NewsGroup SearchString [ExcludeString]'
    say 'example: GetNews ba.jobs.contract OS/2 W2ONLY'
    say ' shows all postings in ba.jobs.contract with OS/2 in the'
    say " subject line and without the string 'W2ONLY' in the body"
    exit
end
OutFile = 'results.' || NewsGroup /* Change this if you don't have long file names */
/*OutFile = 'search.txt'*/
NewsServer = 'your.news.server' /* News server */
SearchField = 'subject' /* Article header field to search */
TRUE = 1
FALSE = 0
REPLYTYPE_OK = '2' /* NNTP reply code first byte */
/* Load the REXX Socket interface */
call RxFuncAdd 'SockLoadFuncs', 'rxSock', 'SockLoadFuncs'
call SockLoadFuncs
'@if exist' OutFile 'del' OutFile
if EstablishProtocol() = FALSE then
exit
/* Get the postings */
call GetPostings socket, NewsGroup, SearchField, ,
SearchString, ExcludeString, OutFile
/* End the protocol with QUIT */
CmdReply = TransactCommand(socket, 'QUIT', 1, '0d0a'x)
/* Close the socket */
call SockSoClose socket
exit
/* ======================================================================== */
EstablishProtocol:
/* ======================================================================== */
socket = ConnectToNewsServer(NewsServer)
if socket <= 0 then
do
    say 'Could not connect to news server'
    return FALSE
end
CmdReply = GetCmdReply(socket, '0d0a'x)
say CmdReply
if left(CmdReply, 1) \= REPLYTYPE_OK then
do
    say 'Could not establish protocol'
    return FALSE
end
return TRUE
/* ======================================================================== */
GetPostings: procedure
/* ======================================================================== */
parse arg socket, NewsGroup, SearchField, SearchString, ExcludeString, OutFile
CRLF = '0d0a'x
Dot = CRLF || '.' || CRLF
REPLYTYPE_OK = '2' /* NNTP reply code first byte */
CmdReply = TransactCommand(socket, 'GROUP' NewsGroup, 1, CRLF)
parse var CmdReply code num first last group
if left(code, 1) = REPLYTYPE_OK then
do
    CmdReply = TransactCommand(socket, ,
        'XPAT' SearchField '1- *' || SearchString || '*', 0, Dot)
    if left(CmdReply, 1) \= REPLYTYPE_OK then
    do
        say 'xpat failed'
        return 0 /* FALSE is not visible inside this procedure */
    end
    CmdReply = StripFirstLine(CmdReply)
    call lineout OutFile, CmdReply
    do while length(CmdReply) > 5
        line = GetFirstLine(CmdReply)
        if line = '' then
            CmdReply = ''
        else
        do
            CmdReply = StripFirstLine(CmdReply)
            parse var line num rest
            body = TransactCommand(socket, 'BODY' num, 0, Dot)
            if ExcludeString = '' | (pos(ExcludeString, body) = 0) then
            do
                From = HeaderLine(socket, 'from')
                call lineout OutFile, From
                Subject = HeaderLine(socket, 'subject')
                call lineout OutFile, Subject
                BodyStripped = StripFirstLine(body)
                call lineout OutFile, BodyStripped
            end
        end
    end
end
return
/* ======================================================================== */
ConnectToNewsServer: procedure
/* ======================================================================== */
parse arg NewsServer
socket = 0
/* Open a socket to the news server. (The Sock* functions are
   documented in the REXX Socket book in the Information folder
   in the OS/2 System folder.) */
call SockInit
if SockGetHostByName(NewsServer, 'host.!') = 0 then
    say 'Could not get host by name' errno h_errno
else
do
    socket = SockSocket('AF_INET','SOCK_STREAM',0)
    address.!family = 'AF_INET'
    address.!port = 119 /* the standard NNTP port */
    address.!addr = host.!addr
    if SockConnect(socket, 'address.!') = -1 then
    do
        say 'Could not connect socket' errno h_errno
        call SockSoClose socket
        socket = -1 /* report the failure to the caller */
    end
end
return socket
/* ======================================================================== */
GetCmdReply: procedure
/* ======================================================================== */
parse arg socket, EndString
/* Receive the response to the command into a variable. Use
more than one socket read if necessary to collect the whole
response. */
if SockRecv(socket, 'CmdReply', 1000) < 0 then do
    say 'Error reading from socket' errno h_errno
    exit
end
ReadCount = 1
MaxParts = 10
do while ReadCount < MaxParts & right(CmdReply, length(EndString)) \= EndString
    if SockRecv(socket, 'CmdReplyExtra', 1000) < 0 then do
        say 'Error reading from socket'
        exit
    end
    CmdReply = CmdReply || CmdReplyExtra
    ReadCount = ReadCount + 1
end
return CmdReply
/* ======================================================================== */
TransactCommand:
/* ======================================================================== */
parse arg socket, Cmd, SayCmd, EndString
/* Send a command to the NNTP server, echoing it to the display
   if requested */
if SayCmd then
    say Cmd
rc = SockSend(socket, Cmd || '0d0a'x)
reply = GetCmdReply(socket, EndString)
if SayCmd then
    say reply
return reply
/* ======================================================================== */
GetFirstLine: procedure
/* ======================================================================== */
parse arg TextBlock
p = pos('0a'x, TextBlock)
if p > 0 then
    line = left(TextBlock,p)
else
    line = ''
return line
/* ======================================================================== */
StripFirstLine: procedure
/* ======================================================================== */
parse arg TextBlock
p = pos('0a'x, TextBlock)
if p > 0 then
    StrippedTextBlock = right(TextBlock,length(TextBlock)-p)
else
    StrippedTextBlock = ''
return StrippedTextBlock
/* ======================================================================== */
HeaderLine: procedure
/* ======================================================================== */
parse arg socket, linetype
CRLF = '0d0a'x
Dot = CRLF || '.' || CRLF
XhdrResponse = TransactCommand(socket, 'xhdr' linetype, 0, Dot)
hl = StripFirstLine(XhdrResponse) /* Strip off the first line */
hl = GetFirstLine(hl) /* Take the first line of what remains */
hl = delword(hl,1,1) /* Delete the article number */
return hl