Glossary of NoCeM Terms
While this does define a number of terms and their nocem usage, it is
more than just a glossary. Herein is the explanation of a good part
of the works-- it is worth reading through.
Remember that this is very early in the development of this project,
and many things below may change.
Here are the definitions of some important terms, _in their NoCeM
context_. This is not meant to be restrictive; however, some of the
differences in terms can be subtle-- using a standard nomenclature may
help to avoid confusion.
- nocem
- (pron. "No See 'Em") A system to help people deal with the
mind-numbing amount of information floating around on usenet.
- EMP
- (as per Chris Lewis) Excessive Multi-Posting (EMP) means the same
as the term "spam" usually does, but is more accurate and
self-explanatory. It means, essentially, too many separate copies of
a substantively identical article.
- MMF
- Stands for Make Money Fast, an annoying and usually illegal
pyramid scheme found posted all over Usenet. Considered fair game for
cancelling by many.
- PGP
- PGP stands for Pretty Good Privacy, a great program written by Phil
Zimmermann that allows you to use strong encryption to protect your
data. It also allows you to verify the authenticity of a message via
digital signatures.
- notice
- One nocem notice consists of a header section and a
body, separated by predefined delimiters. One or more notices may
be transmitted in one message; however, if a notice is too big to fit
in one message, it should be broken down into smaller notices-- a
single notice should not be split across multiple messages.
- message
- A nocem message is a generic term for the wrapper of a
public key signed nocem notice. As a side effect of the message
transport protocol, there may be extraneous text before the notice
(message headers) and/or after the notice (a .signature) which will be
ignored.
- protocol
- A method of transmission of nocem messages. Possible
protocols include news, email, ftp, www, and even the filesystem of
large multi user systems. However, the protocol cannot alter any
text inside the signature delimiters or the signature will not match.
- issuer
- The person who issues a nocem notice. This may or may not be
the same person who posts the message which contains it. A given
issuer can be identified by his/her public key signature. If a person
will be issuing notices with markedly different criteria, he/she may
wish to issue under two different public keys to emphasize the
difference.
- header
- Message headers are byproducts of the transmission protocol
and should be ignored. (The exception being that stage 1 scanners
can look for tags in the message header to cull that message).
Nocem headers are the first half of a nocem notice. They should
consist of information to help the user (or his script) gain more
information about the reasons for this notice. Except for a mandatory
"Version" header, they are not formally defined at this time.
- body
- The message body should consist of the signed nocem notice.
Anything else will be discarded. The lines in the nocem notice body
start with either "<msgid>TAB" or one or more TABs. A line that
starts with a TAB is assumed to relate to the previous msgid. After
the TAB is a list of space delimited newsgroups. These are the
messages that the issuer has marked for action. (The newsgroup is
included to optimize selection according to the user's newsrc.)
- delimiter
- The nocem delimiters are "@@BEGIN NCM HEADERS", "@@BEGIN
NCM BODY", and "@@END NCM BODY". (It should be self evident where
these go.) Note that the ending delimiter MUST be present so that
insecure message delivery protocols can be used. (Someone could forge
a post and append additional message-ids which would be acted on, if
the signature checking program doesn't separate the signed and non
signed text...)
- signature
- The signed notice must be inserted into a message in
asciimode; however, textmode need not be used. (I recommend NOT using
textmode after testing is complete: 1) this will discourage people
from attempting to bypass the authentication, and 2) we'll get the
benefit of compression of very repetitive information.) However,
please use textmode while testing is still going on.
- stage N
- Note that a given stage can consist of multiple scripts, or
it's possible that a number of stages can be handled in one script.
This is meant to be a conceptual separation -- it need not be enforced as
an absolute number of scripts, however, I do like the idea of allowing
the user to transparently swap in different modules. Note that stages
should attempt to optimize whenever possible-- it's possible that
stage 3 could receive two similar notices from two different issuers--
it should attempt to remove duplicates before processing.
- stage 0
- Is composed of the workings of the message transport
protocols (smtp, nntp, http, etc...). At the moment I anticipate only
using these as they were designed. Someday maybe we'll need ncmp...
- stage 1
- Is composed of scripts that scan the stage 0 protocols
looking for nocem messages. Protocol dependent tags will be defined,
and they may use the message headers. For example, all news messages
in newsgroup alt.nocem.misc with (Subject =~ /^@@NCM/) will be culled
and fed into stage 2. (Note that news postings should NOT have a
References header. Followups to a notice might not trim the @@NCM and
would slow things down...) This stage could also be a daemon that
watches news or a mailbox for tags, or goes out and fetches a given
URL. (This would be most appropriate for admins.)
- stage 2
- Is composed of scripts that will accept messages from stage
1, and attempt to authenticate them using the signature and a public
keyring ($ncmdir/$ncmring). PGP could be used for this, although
there are some problems with this; eventually, we can write something
better suited to this task with RSAREF. Note that stage 2 can be
looked upon as two sub stages-- 2a: authenticates the messages against
the special keyring, and 2b: verifies structural integrity of the
resulting notice-- the important task being to make sure that the
Begin Header and End Notice delimiters are balanced.
- stage 3
- (There are three subparts to this stage because they are
very likely to be part of the same script.) Stage 3 is composed of
scripts that take authenticated messages from stage 2 and act on them.
Stage 3a will read the nocem headers, and make a decision to continue
based on the values. Stage 3b will turn each relevant body line into
the information needed for action, if possible. (This may include
looking up the msgid in the history dbz - optimizations are possible
on large systems). Stage 3c will take the desired action, usually
canceling the message (for an admin) or marking it as read in the
newsrc (for a user). People will most likely also want a Stage 3 that
displays a random message from the body, as an easy way to check on an
issuer.
- tag
- A message protocol (stage 0) header that identifies a message
as containing a nocem notice.
- chain
- A script that glues together the stdout and stdin of the
various stage scripts. It's quite possible that a user may want more
than one chain-- either to accommodate a number of stage 0 mechanisms
or to accommodate multiple action types-- one chain might cull messages
from the web for cancellation, and another chain might cull messages
from a best-of newsgroup, for display.
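
To make the notice layout and the stage split described above a little
more concrete, here is a minimal Python sketch. Python is only a choice
for illustration; nothing here is part of nocem itself. It checks the
stage 1 Subject tag, pulls a single notice out from between the three
delimiters as stage 2b might, and turns body lines into a table of
msgid to newsgroups as stage 3b might. The function names, the sample
message-ids, and the "Version: 0.1" value are all made up. It assumes
one notice per message, assumes that a body line which does not start
with a TAB begins with a message-id followed by a TAB, and skips
signature checking (stage 2a) entirely.

```python
import re

# Delimiters as given in the glossary.
BEGIN_HEADERS = "@@BEGIN NCM HEADERS"
BEGIN_BODY = "@@BEGIN NCM BODY"
END_BODY = "@@END NCM BODY"

# Stage 1 tag test, mirroring (Subject =~ /^@@NCM/).
TAG_RE = re.compile(r"^@@NCM")


def is_tagged(subject):
    """Stage 1 (hypothetical helper): does the Subject carry the tag?"""
    return bool(TAG_RE.search(subject))


def split_notice(text):
    """Stage 2b (hypothetical helper): return (header_lines, body_lines).

    Assumes exactly one notice in the text. Raises ValueError unless each
    delimiter appears exactly once and in order. Anything outside the
    delimiters (message headers, a .signature) is ignored.
    """
    lines = text.splitlines()
    positions = []
    for delim in (BEGIN_HEADERS, BEGIN_BODY, END_BODY):
        hits = [i for i, line in enumerate(lines) if line.rstrip() == delim]
        if len(hits) != 1:
            raise ValueError("expected one %r, found %d" % (delim, len(hits)))
        positions.append(hits[0])
    h, b, e = positions
    if not h < b < e:
        raise ValueError("delimiters out of order")
    return lines[h + 1:b], lines[b + 1:e]


def parse_body(body_lines):
    """Stage 3b (hypothetical helper): map each msgid to its newsgroups."""
    marked = {}
    current = None
    for line in body_lines:
        if not line.strip():
            continue  # skip blank lines
        if line.startswith("\t"):
            # Continuation line: more newsgroups for the previous msgid.
            if current is None:
                raise ValueError("continuation line before any msgid")
            groups = line.split()
        else:
            msgid, _, rest = line.partition("\t")
            current = msgid.strip()
            marked.setdefault(current, [])
            groups = rest.split()
        marked[current].extend(groups)
    return marked


# A made-up message: wrapper text, one notice, then a .signature.
sample = "\n".join([
    "Subject: @@NCM made-up example",
    "",
    BEGIN_HEADERS,
    "Version: 0.1",
    BEGIN_BODY,
    "<a1@example.invalid>\talt.test misc.test",
    "\tnews.test",
    "<b2@example.invalid>\talt.test",
    END_BODY,
    "-- ",
    "a .signature, which is ignored",
])

if __name__ == "__main__":
    headers, body = split_notice(sample)
    print(headers)
    print(parse_body(body))
```

Keeping the three helpers separate follows the idea that stages are a
conceptual split: each one could live in its own script within a chain,
or all of them could share one script.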