This is a project focused on Voynich manuscript analysis and object/character detection. Please share it with your friends!

Research

What if we were the first to read a book that no one has been able to read before? To decode what has been hidden for centuries?

Can we find answers together?

A lot of research has tried to make sense of even a small section of the Voynich Manuscript, but none of it has given us exact answers. Many very interesting questions have come up: Is the text plain or encrypted? Do the characters of the manuscript belong to some language family or an ancient language? Has anyone seen the objects drawn in the book in the real world?
The book is full of medieval symbolism, astro-alchemical symbols and much more. We are definitely not the first to attempt this difficult task, but we believe that if we put together the knowledge of people around the world, we might at least get closer!

How this project started

This project began at a university in the Czech Republic - VŠB Technical University of Ostrava. We found the manuscript extremely interesting and thought that our technical point of view might bring new ideas and insight. The possibility that the manuscript was once held in Prague, our capital city, draws our attention to the book even more...
We have collaborated with other universities in our homeland and abroad, but now we want the whole world to work with us. Our activity and dissemination rest not only on research publications but also on the TV documentaries we have taken part in (2x Czech TV).

Analysis of the language

Just a small taste of our research...
We have analyzed the text and, based on the structure of the language, we think that it is not gibberish but an actual language or dialect. We took the Bible in English and a computer transcription of the Voynich manuscript and ran some experiments on the two books to compare their languages. The picture below is a visualisation of the most used words in the manuscript and in the Bible. In both pictures, the 'spread' of the words seems to be very similar. For more see
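To give an idea of what "most used words" means in practice, here is a minimal Python sketch. It is not our actual pipeline: the function names `top_words` and `frequency_profile` and the toy sentence are made up for illustration, and words are simply split on whitespace. It counts how often each word occurs and returns the relative frequency of the most common ones, so that the 'spread' of two texts can be put side by side.

```python
from collections import Counter

def top_words(text, n=5):
    """Return the n most frequent words with their relative frequencies."""
    words = text.lower().split()  # assumption: whitespace-separated words
    counts = Counter(words)
    total = len(words)
    return [(word, count / total) for word, count in counts.most_common(n)]

def frequency_profile(text, n=5):
    """Relative frequencies only, ordered from most to least frequent word."""
    return [round(freq, 3) for _, freq in top_words(text, n)]

# Toy input, not real manuscript or Bible text.
toy_text = "the cat and the dog and the bird"
print(top_words(toy_text, 2))
print(frequency_profile(toy_text, 3))
```

Running `frequency_profile` on two different texts gives two lists of numbers whose shapes can be compared, regardless of what the words themselves are.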

Another experiment is based on a complex network and its representation. Again we took the mysterious text and compared its visualised network with that of randomly generated text. The conversion of the text is done in a simple manner: the words in the text are taken as the nodes of the graph and their ordering as the edges. For example, the text "The Voynich manuscript is an illustrated codex hand-written in an unknown writing system" would be converted to The→Voynich→manuscript→is→an→illustrated→codex→hand-written→in→an→unknown→writing→system. Such a visualization can capture important features and the structure of the text, and it also reveals which words occur most commonly, in which order, etc. For the Bible or any other common text in English, the most used word would be 'the', which acts like a glue that binds content words together in sentences. Any logical text also follows certain grammatical rules, so some repeated patterns must be observable, and all of this information can be found in the complex network of the analyzed text. Based on our experiments, we believe that the manuscript fulfils all of the above and is written in a real, possibly encrypted, language.
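The conversion described above can be sketched in a few lines of Python. This is only an illustration under simple assumptions: words are split on whitespace, an edge is stored as a (word, next word) pair with a count, and the helper names `build_word_network` and `node_degrees` are hypothetical, not taken from our tools.

```python
from collections import defaultdict

def build_word_network(text):
    """Map each directed edge (word, next_word) to how often it occurs."""
    words = text.split()  # assumption: whitespace-separated words
    edges = defaultdict(int)
    for current, following in zip(words, words[1:]):
        edges[(current, following)] += 1
    return dict(edges)

def node_degrees(edges):
    """Sum incoming and outgoing edge counts for every word (node)."""
    degree = defaultdict(int)
    for (source, target), weight in edges.items():
        degree[source] += weight
        degree[target] += weight
    return dict(degree)

sentence = ("The Voynich manuscript is an illustrated codex "
            "hand-written in an unknown writing system")
edges = build_word_network(sentence)
degrees = node_degrees(edges)

# The best-connected word in this sentence is 'an' (two edges in, two out).
print(max(degrees, key=degrees.get), degrees["an"])
```

Even in this single sentence the repeated word 'an' becomes the hub of the graph, which is the kind of 'glue' word the network makes visible.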

Analysis of the alphabet

The alphabet itself was also studied and used as the subject of various experiments.

Similarity between VM characters and the old Indian Khoji dialect, one of many we tried...

Another method studied the importance of each letter in the alphabet and which other letters it most resembles. Why? Well, every language evolves, shaped by social, cultural and economic processes. All common languages had predecessors, i.e. their contemporary structure and character forms are a kind of mutation of earlier versions from previous centuries. If such a trace is there, then, theoretically, we can capture it with special algorithms, and that may lead us to the origin of a given language (compare the original Indian numerals 0-9, the so-called Arabic numerals thanks to Fibonacci, with their present-day form). So, in a very simple way, we measured the similarity between the characters of the VM and some old Indian dialects, and visualized it.
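To illustrate what "measuring the similarity between characters" can look like, here is a small Python sketch. Please note the assumptions: it uses the Jaccard index on tiny hand-drawn bitmaps, which is a stand-in chosen for simplicity and not necessarily the measure used in our experiments, and the two glyphs and the helper names `to_pixels` and `jaccard` are invented for this example.

```python
def to_pixels(rows):
    """Turn a list of strings ('#' = ink) into a set of (row, col) ink pixels."""
    return {(r, c)
            for r, row in enumerate(rows)
            for c, ch in enumerate(row)
            if ch == "#"}

def jaccard(a, b):
    """Shared ink pixels divided by all ink pixels (1.0 = identical shapes)."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0

# Two made-up 3x3 glyphs that differ in a single pixel.
glyph_a = to_pixels(["##.",
                     "#..",
                     "##."])
glyph_b = to_pixels(["##.",
                     "#..",
                     "#.."])

print(jaccard(glyph_a, glyph_b))  # 4 shared pixels out of 5 in total
```

Computing such a score for every pair of characters gives a similarity table, which is the kind of input that graph and dendrogram visualisations are built from.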

The first graph has vertices of different sizes, each belonging to a single letter. The colour and thickness/size show the importance of the letter in the text and its strongest similarity link with the other letters. The second picture is an alternative representation of the first one: the size of a vertex corresponds to the importance of the letter, and the colour represents the similarity and the strength of the relation.

We also tried to visualise VM characters as "communities" in complex networks. And "voilà", this method is able to group VM characters according to their visual similarity. What if we use it together with the character set of another language/dialect?...

...a dendrogram visualization has been used too. Put very simply, it shows how objects belong to groups or families of other objects, in this case characters. We can see that Khoji is somehow included in the VM. Or, if you like, the characters at the highest levels have the most complex structure, and those on lower levels are something like their derivations/mutations/simplifications.

Topics

As you can see, a lot of experiments and analyses have been done in order to understand the manuscript better. And we are just one team among many. Even though the language has not been decoded yet, we can try to identify the topics discussed in the book. It is believed that there are six main topics:

  • Botanical part. The illustrations contain pictures of common-looking European plants. However, most of them are difficult to identify, and some have not been identified to date.
  • Astronomical part. The illustrations include astronomical diagrams, astrological diagrams and symbols, and sketches of the zodiac. Some diagrams are accompanied by the names of the months of the year in Latin script, in a Romance language, perhaps Catalan or Occitan. However, it is possible that the passages written in Latin script were added later.
  • Biological part. The illustrations mostly show miniatures of naked women bathing in some strange formations, some of which resemble bodily organs. Some of the women have crowns.
  • Cosmological part. The illustrations are remotely reminiscent of maps of a strange landscape or cosmological sketches. Castles, perhaps volcanoes, are shown.
  • Pharmacological part. Illustrations show parts of plants (roots, leaves). Perhaps these are pharmaceutical prescriptions, which may be indicated by the fact that the text is divided into short paragraphs in this section.
  • Recipes (?). Without illustrations, the text is divided into short paragraphs, separated by bullets in the shape of a flower or star.

But is that everything? Our friends from the University of Memphis (USA) have used a special mathematical method to try to reveal the topics discussed in the VM. Well, it is only a mathematical estimate, but it seems that there are more topics than just those few...

The x-axis represents the officially recognised topics in the VM, the coloured spots those recognised by the mathematical analysis...

We do not claim that our research is the best, the most precise or the most complete. However, we try to look at this problem in an unconventional way. As you can see, there is a lot of work behind the scenes of this project, but now we need your ideas and opinions to help us get closer to understanding this mysterious manuscript. Either way, math and computers cannot reveal everything. A human mind, full of creativity, intuition and knowledge, is very powerful, and no algorithm can compare with it. Especially when many minds cooperate. Now is the right time 😃

BE THE FIRST AND JOIN US IN THIS FIRST GLOBAL PLANETARY CROWDSOURCING EXPERIMENT and share your knowledge with us!