Spell checking is such a common feature of today’s software that we expect to see it in browsers, basic text editors, and every computing device. Fifty years ago, however, implementing it was a serious challenge because of a lack of memory, a constraint that affected even supercomputers. The solution was so effective that the underlying technology, a flexible compression algorithm, is still in use today.
Okay, so a five-decade-old computing story is hardly ‘news’, but a timely reminder of this work convinced us it was worth sharing, and for the details we have to turn to Abhinav Upadhyay’s blog post on the topic.
It all began in 1975, when programmers at AT&T wanted to pitch Unix to the company’s Patent Division as a text-processing system. An operating system is not the obvious thing to sell as a word processor but hey, things were different back then. One of the obstacles to using Unix in such a role was that it would need a fully functional spell checker.
The easiest way to do this is to load an entire dictionary into memory and then simply have the computer look words up. Not only would that have been very slow, however, but the computers of the time did not have the memory to hold the dictionary. As Upadhyay asks at the start of his blog post: how do you fit a 250 kB file into just 64 kB of RAM?
Of course you compress it, but that is not as easy as it sounds. For example, I can create a 256 kB text file containing more than 260,000 alphabetic characters and use file compression tools and the Lempel-Ziv-Markov chain algorithm (LZMA) to squash it down to just 257 bytes. But that is on a modern computer with a very fast, multi-core processor and a large amount of high-speed RAM. And the result is useless as a dictionary.
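To make that concrete, here is a minimal sketch in Python of the same effect. The file is invented (one word repeated over and over) and only stands in for the kind of test file described above, but it shows how a quarter-megabyte of highly repetitive text collapses to almost nothing under LZMA:

```python
# A large but extremely repetitive text file compresses to almost nothing
# with LZMA. The content here is made up: one word repeated many times.
import lzma

text = ("dictionary\n" * 24_000).encode("ascii")    # roughly 256 kB of text
compressed = lzma.compress(text)

print(f"original:   {len(text):,} bytes")            # 264,000 bytes
print(f"compressed: {len(compressed):,} bytes")      # a few hundred bytes
```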
AT&T was using PDP-11 machines, which are light years away in terms of processing power and memory. Searching through such a compressed file on a machine of that era would have been unusably slow in real terms, and my squashed text file is only good for a single word anyway.
The first prototype spell checker for Unix was put together by computer scientist Steve Johnson, and while it worked, it was slow (because it was disk-based) and prone to errors. Douglas McIlroy, one of the Unix luminaries at Bell Laboratories, then picked up the gauntlet and attacked the problem on two fronts: an algorithm to reduce the number of words the dictionary had to hold, and a data structure to squeeze that dictionary into the few kB of system memory available.
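To give a flavour of the first of those fronts, here is a toy sketch of affix stripping: store only word stems, and reduce inflected forms to a stem before looking them up. The stem list and suffix set below are invented for illustration and are far simpler than the real rules:

```python
# Toy affix stripping: keep only stems in the dictionary and strip common
# suffixes before lookup. Stems and suffixes here are purely illustrative.
STEMS = {"check", "spell", "compress"}
SUFFIXES = ("ing", "ed", "er", "s")

def is_word(word: str) -> bool:
    if word in STEMS:
        return True
    return any(word.endswith(s) and word[: -len(s)] in STEMS for s in SUFFIXES)

print(is_word("checking"), is_word("spells"), is_word("spelx"))  # True True False
```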
Upadhyay’s blog post walks through all the mathematics behind both mechanisms, and it makes for interesting reading, though it takes a solid grounding in mathematics and computer science to have any chance of following it.
What matters, though, is the end result. McIlroy’s work eventually produced an algorithm that needs an astonishing 14 bits of memory per word, so a dictionary with 30,000 entries fits in under 52 kB. The theoretical minimum for the scheme is 13.57 bits per word (the slight extra memory use bought faster word lookups), so it is fair to say McIlroy did an incredible job. That level of usable compression is remarkable even today, in terms of how close it comes to the theoretical limit.
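As a rough sanity check on those figures, assuming each word is reduced to a 27-bit hash code (as in McIlroy’s design) and only the set of hash values needs to be stored, the arithmetic works out like this:

```python
# Back-of-the-envelope check of the figures above, assuming each word is
# reduced to a 27-bit hash code and only the set of hash values is stored.
from math import e, log2

n_words, hash_bits = 30_000, 27

# Information-theoretic floor for storing n values drawn from a 2**27 range:
# about log2(range / n) + log2(e) bits per value.
floor = log2(2**hash_bits / n_words) + log2(e)
print(f"theoretical floor: {floor:.2f} bits per word")          # about 13.57

# McIlroy's structure spent roughly 14 bits per word for faster lookup:
print(f"30,000 words at 14 bits: {30_000 * 14 / 8 / 1024:.1f} kB")  # about 51.3 kB
```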
A portion of his solution involved the use of Golomb coding, a form of lossless compression that is still in use today in the guise of Rice coding, which is found in FLAC, Apple Lossless, and lossless JPEG. Modern spell checkers certainly work very differently, and they are certainly not under the same processing and RAM constraints that Johnson and McIlroy were – some need 32 MB of memory just for themselves, an amount of RAM that could only be dreamed of in the 1970s.
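For the curious, here is a minimal sketch of Rice coding, the power-of-two special case of Golomb coding: each value is split into a quotient written in unary and a remainder written in fixed-width binary, so small values (such as the small gaps between consecutive sorted hash codes) take very few bits. These functions are illustrative only, not lifted from FLAC or any other codec:

```python
# Minimal Rice coding (Golomb coding with parameter M = 2**k): a unary
# quotient followed by a k-bit binary remainder.

def rice_encode(n: int, k: int) -> str:
    q, r = n >> k, n & ((1 << k) - 1)
    return "1" * q + "0" + format(r, f"0{k}b")   # unary quotient, then remainder

def rice_decode(bits: str, k: int) -> int:
    q = bits.index("0")                          # length of the unary run
    return (q << k) + int(bits[q + 1 : q + 1 + k], 2)

# Encoding the small gaps between sorted hash codes keeps the codewords short.
gaps, k = [3, 11, 7, 2, 19], 3
print([rice_encode(g, k) for g in gaps])         # ['0011', '10011', '0111', ...]
print(rice_decode(rice_encode(19, k), k))        # 19
```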
This is proof, if anyone ever needed it, that the most creative programming solutions are born under the pressure of hardware constraints. Today’s computers are so capable, with such a wealth of resources, that many solutions fall into the ‘it just works’ category. While I am not suggesting we should return to a world of kHz clock speeds and miserly memory, I do wish software companies would take a leaf out of those early programmers’ book.