Glossary of NoCeM Terms

While this does define a number of terms and their NoCeM usage, it is more than just a glossary. It explains a good part of how the system works-- it is worth reading through.

Remember that this is very early in the development of this project, and many things below may change.

Here are the definitions of some important terms, _in their NoCeM context_. This is not meant to be restrictive; however, some of the differences between terms can be subtle-- using a standard nomenclature may help to avoid confusion.

nocem: (pron. "No See 'Em") A system to help people deal with the mind-numbing amount of information floating around on Usenet.
EMP: (as per Chris Lewis) Excessive Multi-Posting (EMP) means the same as the term "spam" usually does, but is more accurate and self-explanatory. It means, essentially, too many separate copies of a substantively identical article.
MMF: Stands for Make Money Fast, an annoying and usually illegal pyramid scheme found posted all over Usenet. Considered fair game for cancelling by many.
PGP: Stands for Pretty Good Privacy - a great program written by Phil Zimmermann that allows you to use strong encryption to protect your data. It also allows you to verify the authenticity of a message via digital signatures.
notice: One nocem notice consists of a header section and a body, separated by predefined delimiters. One or more notices may be transmitted in one message; however, if a notice is too big to fit in one message, it should be broken down into smaller notices-- a single notice should not be split across multiple messages.
message: A nocem message is a generic term for the wrapper of a public key signed nocem notice. As a side effect of the message transport protocol, there may be extraneous text before the notice (message headers) and/or after the notice (a .signature) which will be ignored.
protocol: A method of transmission of nocem messages. Possible protocols include news, email, ftp, www, and even the filesystem of large multi user systems. However, the protocol cannot alter any text inside the signature delimiters or the signature will not match.
issuer: The person who issues a nocem notice. This may or may not be the same person who posts the message which contains it. A given issuer can be identified by his/her public key signature. If a person will be issuing notices with markedly different criteria, he/she may wish to issue under two different public keys to emphasize the difference.
header: Message headers are byproducts of the transmission protocol and should be ignored. (The exception being that stage 1 scanners can look for tags in the message header to cull that message). Nocem headers are the first half of a nocem notice. They should consist of information to help the user (or his script) gain more information about the reasons for this notice. Except for a mandatory "Version" header, they are not formally defined at this time.
body: The message body should consist of the signed nocem notice. Anything else will be discarded. The lines in the nocem notice body start with either a msgid or one or more TABs. A line that starts with a TAB is assumed to relate to the previous msgid. After the TAB is a list of space-delimited newsgroups. These are the messages that the issuer has marked for action. (The newsgroup is included to optimize selection according to the user's newsrc.)
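
A rough sketch of reading such body lines, in Python. It assumes that a non-TAB line holds a msgid, then a TAB, then the newsgroups, and that a TAB-led line adds more newsgroups to the previous msgid. The function name parse_body is made up for illustration; nothing here is a defined part of nocem.

```python
def parse_body(lines):
    """Turn notice body lines into (msgid, [newsgroups]) pairs."""
    entries = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if not line.strip():
            continue  # ignore blank lines
        if line.startswith("\t"):
            # Continuation line: more newsgroups for the previous msgid.
            if not entries:
                raise ValueError("continuation line before any msgid")
            entries[-1][1].extend(line.split())
        else:
            # Assumed layout: msgid, one TAB, space-delimited newsgroups.
            msgid, _, groups = line.partition("\t")
            entries.append((msgid.strip(), groups.split()))
    return entries
```

A later stage could then match each newsgroup list against the user's newsrc before doing any further work on that msgid.
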
delimiter: The nocem delimiters are "@@BEGIN NCM HEADERS", "@@BEGIN NCM BODY", and "@@END NCM BODY". (It should be self evident where these go.) Note that the ending delimiter MUST be present so that insecure message delivery protocols can be used. (Someone could forge a post and append additional message-ids which would be acted on, if the signature checking program doesn't separate the signed and non signed text...)
signature: The signed notice must be inserted into a message in asciimode; however, textmode need not be used. (I recommend NOT using textmode after testing is complete: 1) this will discourage people from attempting to bypass the authentication, and 2) we'll get the benefit of compression of very repetitive information.) However, please use textmode while testing is still going on.
stage N: Note that a given stage can consist of multiple scripts, or it's possible that a number of stages can be handled in one script. This is meant to be a conceptual separation -- it need not be enforced as an absolute number of scripts; however, I do like the idea of allowing the user to transparently swap in different modules. Note that stages should attempt to optimize whenever possible-- it's possible that stage 3 could receive two similar notices from two different issuers-- it should attempt to remove duplicates before processing.
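
One way the duplicate removal might look, as a Python sketch. It assumes each notice has already been reduced to a list of (msgid, [newsgroups]) pairs; that representation and the name merge_entries are assumptions made for this example only.

```python
def merge_entries(notices):
    """Merge several notices into one msgid -> newsgroups mapping.

    A msgid named by more than one issuer appears only once, with the
    union of its newsgroups in first-seen order.
    """
    merged = {}
    for entries in notices:
        for msgid, groups in entries:
            seen = merged.setdefault(msgid, [])
            for group in groups:
                if group not in seen:
                    seen.append(group)
    return merged
```

Stage 3 would then act once per key of the result instead of once per notice line.
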
stage 0: Is composed of the workings of the message transport protocols (smtp, nntp, http, etc...). At the moment I anticipate only using these as they were designed. Someday maybe we'll need ncmp...
stage 1: Is composed of scripts that scan the stage 0 protocols looking for nocem messages. Protocol dependent tags will be defined, and they may use the message headers. For example, all news messages in newsgroup alt.nocem.misc with (Subject =~ /^@@NCM/) will be culled and fed into stage 2. (Note that news postings should NOT have a references header. Followups to a notice might not trim the @@NCM and would slow things down...) This stage could also be a daemon that watches news or a mailbox for tags, or goes out and fetches a given URL. (This would be most appropriate for admins.)
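
The news example above could be sketched in Python roughly as follows. The function name is_nocem_candidate is hypothetical, and the sketch assumes the whole article (headers plus body) is available as one string.

```python
import re
from email import message_from_string

# The subject tag from the example above: Subject =~ /^@@NCM/
TAG = re.compile(r"^@@NCM")


def is_nocem_candidate(raw_message, group="alt.nocem.misc"):
    """Return True if the article is in `group` and its subject carries the tag."""
    msg = message_from_string(raw_message)
    # Newsgroups is a comma-separated list of group names.
    groups = [g.strip() for g in (msg.get("Newsgroups") or "").split(",")]
    subject = msg.get("Subject") or ""
    return group in groups and TAG.match(subject) is not None
```

Only articles that pass this test would be handed on to stage 2; everything else is dropped before any signature checking is attempted.
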
stage 2: Is composed of scripts that will accept messages from stage 1, and attempt to authenticate them using the signature and a public keyring ($ncmdir/$ncmring). PGP could be used for this, although there are some problems with it; eventually, we can write something better suited to this task with RSAREF. Note that stage 2 can be looked upon as two substages-- 2a authenticates the messages against the special keyring, and 2b verifies the structural integrity of the resulting notice-- the important task being to make sure that the Begin Header and End Notice delimiters are balanced.
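
A minimal Python sketch of the 2b check. It assumes the text has already passed 2a (the authentication itself is not shown) and that each delimiter sits on a line of its own; the name split_notices is invented for this example.

```python
BEGIN_HEADERS = "@@BEGIN NCM HEADERS"
BEGIN_BODY = "@@BEGIN NCM BODY"
END_BODY = "@@END NCM BODY"


def split_notices(text):
    """Return a list of (header_lines, body_lines), one pair per notice.

    Raises ValueError if a delimiter is missing, repeated, or out of order.
    """
    notices = []
    state = "outside"
    headers, body = [], []
    for line in text.splitlines():
        marker = line.rstrip()
        if marker == BEGIN_HEADERS:
            if state != "outside":
                raise ValueError("header delimiter inside an open notice")
            state, headers, body = "headers", [], []
        elif marker == BEGIN_BODY:
            if state != "headers":
                raise ValueError("body delimiter without a header delimiter")
            state = "body"
        elif marker == END_BODY:
            if state != "body":
                raise ValueError("end delimiter without a body delimiter")
            notices.append((headers, body))
            state = "outside"
        elif state == "headers":
            headers.append(line)
        elif state == "body":
            body.append(line)
        # Lines outside any notice are ignored.
    if state != "outside":
        raise ValueError("notice is missing its end delimiter")
    return notices
```

Because anything after the ending delimiter is ignored, extra msgids tacked onto the end of a message never reach stage 3.
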
stage 3: (There are three subparts to this stage because they are very likely to be part of the same script.) Stage 3 is composed of scripts that take authenticated messages from stage 2 and act on them. Stage 3a will read the nocem headers, and make a decision to continue based on the values. Stage 3b will turn each relevant body line into the information needed for action, if possible. (This may include looking up the msgid in the history dbz - optimizations are possible on large systems). Stage 3c will take the desired action, usually canceling the message (for an admin) or marking it as read in the newsrc (for a user). People will most likely also want a Stage 3 that displays a random message from the body, as an easy way to check on an issuer.
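
Stage 3a might be sketched in Python as below. Since the nocem headers are not formally defined yet, the "Name: value" layout is purely an assumption, as are the names parse_headers and should_continue; only the mandatory "Version" header comes from this document.

```python
def parse_headers(lines):
    """Parse assumed 'Name: value' lines into a dict keyed by lowercase name."""
    headers = {}
    for line in lines:
        name, sep, value = line.partition(":")
        if sep:
            headers[name.strip().lower()] = value.strip()
    return headers


def should_continue(header_lines, accept):
    """Stage 3a: decide whether stages 3b and 3c should run for this notice.

    `accept` is a user-supplied function that takes the header dict and
    returns True or False.
    """
    headers = parse_headers(header_lines)
    if "version" not in headers:
        return False  # the one mandatory header is missing
    return bool(accept(headers))
```

The accept function is where a user's own criteria would live, so swapping it out changes the policy without touching the rest of the stage.
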
tag: Message protocol (stage 0) header that identifies this message as containing a nocem notice.
chain: A script that glues together the stdout and stdin of the various stage scripts. It's quite possible that a user may want more than one chain-- either to accommodate a number of stage 0 mechanisms or to accommodate multiple action types-- one chain might cull messages from the web for cancellation, and another chain might cull messages from a best-of newsgroup, for display.
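
As a rough illustration of a chain, here is a Python sketch that runs each stage script in turn and feeds its stdout to the next stage's stdin. The name run_chain is hypothetical, and the stage commands are whatever scripts the user has installed; a real chain might well be a plain shell pipeline instead.

```python
import subprocess


def run_chain(commands, data):
    """Run each stage command in order, piping stdout into the next stdin.

    `commands` is a list of argument lists, one per stage script.
    `data` is the bytes fed to the first stage. A stage that exits with a
    non-zero status stops the chain by raising CalledProcessError.
    """
    for cmd in commands:
        result = subprocess.run(cmd, input=data, stdout=subprocess.PIPE, check=True)
        data = result.stdout
    return data
```

Each stage here finishes before the next one starts, which keeps the sketch simple; a long-running chain would more likely connect the stages with real pipes so they run concurrently.
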

Email: moose@cm.org