Artificial Intelligence - Markov Chains

What is a Markov Chain?

A Markov chain is a basic form of artificial intelligence used to predict a succession of things. It works with objects, each of which has an associated probability value. These probability values are then used to determine the next object in the chain.
The most common use for Markov chains is to formulate sentences. A Markov chain starts with a single word, then uses the percentage chance that the word will be followed by each other word to pick the next one. There are many different paths a Markov chain could take: it could start with the word “The” and be followed by “Cat” or “Dog.” The chain continues until it is decided that the chain should end. This can lead to sentences that come out roughly grammatically correct. Often the result is funny, and with a little imagination the reply can read like something a child might say.
So how do we get the words and their corresponding probability values? First we need to take input. The Markov chain needs to start learning the English language, so you give it a number of grammatically correct sentences to read; the more the better. As the software analyzes the sentences, it calculates how often each word is followed by the next word in the sentence and turns that into a percentage value. When the Markov chain runs, you get the output by stringing the words together.
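The learning step described above can be sketched in a few lines. This is a minimal in-memory version (the function names `train` and `probabilities` are my own, not from any library): it counts how often each word follows another, then converts those counts into percentage chances.

```python
from collections import defaultdict

def train(sentences):
    """Count how often each word is followed by each next word."""
    counts = defaultdict(lambda: defaultdict(int))
    for sentence in sentences:
        words = sentence.lower().split()
        for current, nxt in zip(words, words[1:]):
            counts[current][nxt] += 1
    return counts

def probabilities(counts, word):
    """Turn the raw follow-up counts for one word into chances."""
    followers = counts[word]
    total = sum(followers.values())
    return {nxt: n / total for nxt, n in followers.items()}

model = train(["the cat sat on the mat", "the dog goes on a walk"])
print(probabilities(model, "the"))  # "cat", "mat" and "dog" each get 1/3
```

With more training sentences, the counts shift toward the word pairings that actually occur most often, which is exactly the percentage table the article describes.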

How do I implement a Markov Chain?

To implement a Markov chain you need to start by taking input from various sentences. Break each sentence into its words, and then add those words to a database table. My suggestion is to use Microsoft SQL Express or MySQL. The rows of the table should contain every word that has been read. The columns of the table should be mainword, word[0], word[1], word[2], and so on. I suggest implementing at the very least word[20], if not more.

Markov Chain Table

In a second table you will want to collect the use count for each word. This table corresponds to the first table. As each sentence is read, add 1 to the use count for each word used. I generally advise calculating the percentage chance for each word at run time rather than updating all the values in the database each time a word's count changes. This allows for faster processing in your application.

Markov Chain Table
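Here is one way the count table and the run-time percentage calculation might look. This sketch uses Python's built-in SQLite for illustration (the article suggests SQL Express or MySQL; the queries would be nearly identical), and it stores one row per word pair rather than the wide word[0..20] layout, which keeps the example short. The table and function names are my own.

```python
import sqlite3

# In-memory database for illustration; a server database works the same way.
db = sqlite3.connect(":memory:")
db.execute("""CREATE TABLE pairs (
    mainword TEXT, nextword TEXT, usecount INTEGER,
    PRIMARY KEY (mainword, nextword))""")

def learn(sentence):
    """Add 1 to the use count for every word pair in the sentence."""
    words = sentence.lower().split()
    for cur, nxt in zip(words, words[1:]):
        db.execute("INSERT OR IGNORE INTO pairs VALUES (?, ?, 0)", (cur, nxt))
        db.execute("UPDATE pairs SET usecount = usecount + 1 "
                   "WHERE mainword = ? AND nextword = ?", (cur, nxt))

def chances(word):
    """Compute the percentage chances at run time from the raw counts."""
    rows = db.execute("SELECT nextword, usecount FROM pairs "
                      "WHERE mainword = ?", (word,)).fetchall()
    total = sum(n for _, n in rows)
    return {nxt: n / total for nxt, n in rows}

learn("the dog goes on a walk")
learn("the cat goes on the mat")
print(chances("the"))
```

Because `chances` divides by the total only when asked, learning a new sentence just bumps a few counters instead of rewriting every stored percentage, which is the faster-processing point made above.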

Restrictions of a Markov Chain

Markov chains learn from the examples you feed them, so I suggest starting with very simple sentences, things like “The dog goes on a walk.” Teach your program a few hundred of these simple sentences and it will begin to learn. When you finally get your first output from the Markov chain, it will be readable, and if you are lucky it will make some sense. You may find that your Markov chain can continue on forever and generate a very long string of words. My suggestion is to implement a way to terminate the sentence: when reading your input sentences, take note of terminating words, as these can tell you when a sentence should end. Ideally you want output similar to your basic input sentences, but this can be tricky, since a sentence may terminate after only two words if the second word happens to be one that finishes a sentence. Run-on sentences are pretty normal for your first Markov chain. The output will start to form sentences that appear grammatically correct for the most part, but you will find some oddities in the sentences your program generates. As said before, these usually come across as funny and childlike, or like what you might expect from someone speaking English as a second language. These are the typical restrictions you will find in a basic implementation of your first Markov chain.

           "You a rum and Coke"
           "A tough time for television"
           "A way to check for television"
           "You are very boring at the value of the bee, will continue my coffee"
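The generation loop with both safeguards described above, a terminating-word check and a hard length cap, can be sketched like this. The transition table here is tiny and hand-made purely for illustration; in practice it would come from your training step.

```python
import random

# word -> {follower: count}; a trained chain would fill this from real input.
TABLE = {
    "the": {"dog": 2, "cat": 1},
    "dog": {"goes": 2},
    "cat": {"sat": 1},
    "goes": {"on": 2},
    "sat": {"down": 1},
    "on": {"a": 1, "the": 1},
    "a": {"walk": 1},
}
TERMINATORS = {"walk", "mat", "down"}  # words noted as sentence-enders in training
MAX_WORDS = 20                         # hard cap to stop run-on sentences

def generate(start):
    words = [start]
    while len(words) < MAX_WORDS:
        followers = TABLE.get(words[-1])
        if not followers:                 # dead end: no known follower
            break
        choices, weights = zip(*followers.items())
        nxt = random.choices(choices, weights=weights)[0]
        words.append(nxt)
        if nxt in TERMINATORS:            # stop on a terminating word
            break
    return " ".join(words)

print(generate("the"))
```

Without the `TERMINATORS` check the loop happily bounces between "on" and "the" until it hits the cap, which is exactly the run-on behavior a first Markov chain produces.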

Improving Markov Chains

There are a number of improvements we can make to a Markov chain. First, it is usually helpful to have the reply of a Markov chain be related to your input. I suggest making a database table and filling it with nouns (a person, place, or thing). In your code, identify the nouns the person has entered as input and formulate a reply containing one of them. Because the noun is the subject of a sentence, this gives the Markov chain a more meaningful reply, as it will appear to talk about what you just said. Furthermore, you could use other parts of speech, for example adjectives like the word “large” in “large cow,” to talk about big cows as opposed to smaller ones. This helps give the reply a more human touch. Since there are tons of words in the English language, you may find it difficult to generate your own list of words and classify them as nouns, verbs, adjectives, and so on. In this case I suggest a quick Google search for dictionary text files, which you can parse into your database.

While these are little things that help make your replies better, you will find they still lack a certain correctness. My solution to this problem is not a Markov chain of words, but rather a Markov chain of sentences. This takes far more training data than a Markov chain of words, simply because of the vast number of sentences that can be created. If you decide to go down the route of a Markov chain of sentences, you will want to take one of two routes to obtain your training data: either have your program talk only about a specific topic, feeding it hundreds of sentences about that one topic, or find a way to feed it thousands if not hundreds of thousands of sentences. For the latter, it does not make a lot of sense to generate these sentences by hand.
My only suggestion is to find vast amounts of conversation transcripts, or possibly hook your program up to a lively chat room or IRC channel and have it feed off that for an extended length of time. I think if you choose to go down the Markov-chain-of-sentences route, you will find your sentences and replies make a lot more sense than with a Markov chain of words. Again, this gives the replies a much more human feel.
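The noun-seeding idea can be sketched as follows. The noun set, transition table, and function names here are all hypothetical stand-ins (a real version would read the nouns from the dictionary table and the chain from trained data): the reply simply starts the chain at a noun found in the user's input.

```python
import random

# Hypothetical noun list; in practice this comes from a dictionary table.
NOUNS = {"dog", "cat", "coffee", "television"}

# Tiny hand-made transition table standing in for a trained chain.
TABLE = {
    "dog": {"goes": 1},
    "coffee": {"is": 1},
    "goes": {"on": 1},
    "is": {"hot": 1},
    "on": {"a": 1},
    "a": {"walk": 1},
}

def pick_topic(user_input):
    """Find the first noun in the user's input to use as the reply subject."""
    for word in user_input.lower().split():
        if word in NOUNS:
            return word
    return None

def reply(user_input, max_words=10):
    topic = pick_topic(user_input)
    if topic is None:
        topic = random.choice(sorted(NOUNS))  # no noun found: pick one at random
    words = [topic]
    while len(words) < max_words and words[-1] in TABLE:
        followers = TABLE[words[-1]]
        choices, weights = zip(*followers.items())
        words.append(random.choices(choices, weights=weights)[0])
    return " ".join(words)

print(reply("I took my dog outside"))  # reply starts with "dog"
```

Because the chain is seeded with a noun the user actually typed, the reply appears to be about what was just said, which is the "meaningful reply" effect described above.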

Thanks for the read and enjoy!


©2017 All Rights Reserved.