The designers of the on-the-wire file format knew they would not get it right on the first attempt. Their first decision was therefore that the first character of a transmitted file would be the letter “A”, identifying this version of the format.
Why were email-style headers not used at the beginning?
People often ask why email-style headers were not used initially, even though they were later adopted for HTTP. The main reason is that most of the designers had no exposure to such protocols at the time. The author admits that he himself only learned about the Internet protocols two years later, when he received a copy of a workbook, and that it was only because of USENET that he became aware of such protocols at all.
Instead, the designers chose a minimalist style influenced by the Seventh Edition of Unix. Even had they known of the Internet, which was called the ARPANET in those days, they would have deliberately avoided its header style. The first version of their code was a shell script, and it was easier to treat complete lines as single units. That was certainly simpler than parsing headers that allowed arbitrary case, optional white space, and continuation lines.
Issue of duplicate articles
There was also the question of how to handle duplicate articles. The designers felt that an article ID was absolutely necessary so that duplicates could be detected. They decided that the article ID would be the remainder of the first line, after the letter A.
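The first-line convention described above can be illustrated with a minimal sketch. It assumes only what the text states: the first character is the version letter and the rest of the line is the article ID. The names `parse_first_line` and `DuplicateFilter`, and the sample ID `duke.123`, are hypothetical and chosen purely for illustration; the original was a shell script, not Python.

```python
def parse_first_line(line: str) -> tuple:
    """Split an article's first line into (version letter, article ID)."""
    line = line.rstrip("\n")
    if not line:
        raise ValueError("empty first line")
    # First character is the format version; everything after it is the ID.
    return line[0], line[1:]


class DuplicateFilter:
    """Remembers article IDs so that repeated arrivals can be dropped."""

    def __init__(self) -> None:
        self._seen = set()

    def is_new(self, first_line: str) -> bool:
        version, article_id = parse_first_line(first_line)
        if version != "A":
            raise ValueError("unknown format version: " + version)
        if article_id in self._seen:
            return False  # duplicate: this ID has arrived before
        self._seen.add(article_id)
        return True


# Hypothetical article ID, for illustration only.
news = DuplicateFilter()
print(news.is_new("Aduke.123"))  # first arrival
print(news.is_new("Aduke.123"))  # second arrival of the same ID
```

Because the version letter comes first, a later format could be told apart by checking a single character before any other parsing is attempted.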
The designers also wanted to minimize transfer costs. At that time, articles were transmitted over expensive dial-up connections, so sending a file that was not needed cost real money. Each article therefore carried a list of the systems that had already seen it.
This information was a string of hostnames separated by exclamation points. The last element was the login name of the user who posted the article.
An obvious question is why that format was chosen rather than a separator such as blanks or commas. The answer is that it was the format UUCP already used for email.
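As a rough sketch of how such a path could be used to avoid paying for a pointless transfer: the code below splits the string on exclamation points, treats the final element as the poster's login, and skips any neighbor already listed. The function names and the sample hosts and login are hypothetical. Prepending the relaying host's name follows the usual UUCP convention and is an assumption here, since the text does not say where new hostnames were added.

```python
def systems_seen(path: str) -> list:
    """Hostnames in the path; the final element is the poster's login."""
    return path.split("!")[:-1]


def poster(path: str) -> str:
    """Login name of the user who posted the article."""
    return path.split("!")[-1]


def should_send(path: str, neighbor: str) -> bool:
    """Only send to a neighbor that is not already listed in the path."""
    return neighbor not in systems_seen(path)


def relay(path: str, local_host: str) -> str:
    """Record this host before passing the article on.

    Assumption: the relaying host is prepended, as in UUCP mail paths.
    """
    return local_host + "!" + path


# Hypothetical hosts and login, for illustration only.
path = relay("duke!alice", "unc")
print(path)
print(should_send(path, "duke"))  # duke has already seen the article
print(should_send(path, "phs"))   # phs has not
```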
Today the situation is entirely different, because there is full connectivity over the Internet. Things are no longer done this way. Instead, one party transmits a list of article IDs, and the other party asks for the ones it has not yet seen.
Interestingly, the designers had considered something of that sort but rejected it. After all, they relayed articles over infrequent dial-up connections. In addition, the number of loops, and hence of duplicate articles received, did not appear to be high.
In the original plan, Duke would poll several sites once per night. If Duke sent a list of articles during that call, the sites could not request them until the next night, and would not receive them until the night after that.
Such a delay was not acceptable, so the designers accepted the possibility of transmitting unnecessary text instead. There would occasionally be some redundant transmissions, but the volume was expected to be tolerable. This was an era before MP3 and JPG formats, so articles were text only, and therefore relatively small and inexpensive to send.
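For comparison, the ID-list exchange that the designers considered and set aside can be sketched in a few lines. This is only an illustration of the idea as described above: one side offers article IDs and the other replies with those it has not seen. The function name and the sample IDs are hypothetical and do not correspond to any real protocol command.

```python
def ids_to_request(offered, already_seen):
    """Return the offered article IDs that have not been seen locally.

    The order of the offer is preserved so the reply is predictable.
    """
    return [article_id for article_id in offered if article_id not in already_seen]


# Hypothetical IDs, for illustration only.
offered = ["duke.1", "unc.7", "duke.2"]
already_seen = {"unc.7"}
print(ids_to_request(offered, already_seen))
```

With a single nightly call, the offer, the request, and the delivery would each have had to wait for a separate call, which is the delay the designers were unwilling to accept.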
Obviously, the article’s title and date also had to be sent. The date and time line was generated with the library routines asctime() and ctime(). The designers had decided from the start that articles needed to be sorted into multiple categories, that is, newsgroups. In the original design, however, there was just a single relayed newsgroup, called NET, and no distinction was made between different kinds of non-local articles.
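To give a feel for what a date line produced by asctime() looks like, here is a small sketch using Python's `time.asctime`, which follows the same fixed layout as the C routine but without the trailing newline. The helper name `date_line` is hypothetical, and UTC is used here only to keep the output reproducible; that choice is an assumption, not something the text specifies.

```python
import time


def date_line(seconds_since_epoch: float) -> str:
    """Format a timestamp in the fixed asctime() layout.

    Assumption: UTC is used so the result does not depend on the
    local time zone of the machine running the example.
    """
    return time.asctime(time.gmtime(seconds_since_epoch))


print(date_line(0))  # Thu Jan  1 00:00:00 1970
```

The layout is fixed-width, with the day of the month padded to two characters, which suits a format that treats whole lines as single units.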
Why was there cross-posting of articles?
Finally, one more point is worth noting. The designers knew from the beginning that some articles could belong to multiple categories, so they supported cross-posting an article to several newsgroups from the start. Although some people considered cross-posting impolite, the feature was included deliberately.