David Eastman
Engaging with the theoretical mathematics that underpins Artificial Intelligence (AI) can facilitate a smoother transition towards acquiring the skills employed by AI developers, fostering a deeper understanding of the intricate mechanisms at play behind the scenes.
AI is full of mathematical concepts and terminology, most of it conceptual rather than algebraic. The aim here is not to work through the details; even a passing familiarity makes technical white papers much easier to read.
Andrei Andreevich Markov, an eminent Russian mathematician and skilled chess player, laid the groundwork for contemporary computing through his seminal contributions to processes and probability theory, which remain pivotal in the field of AI.
Fundamentally, any process can be distilled into distinct states and transitions. This abstraction not only resonates well with computational systems but also mirrors how humans naturally construct narratives. Rather than detailing events in a real-time sequence, the focus tends to be on pivotal moments. For example, consider John’s trip to the shops—it unfolds as a coherent sequence of events, devoid of explicit time markers.
John’s activities can be classified as follows:
- Traveling (to and from the shops)
- Shopping (buying bread or a sandwich)
- Chatting (participating in conversations)
These activities can be depicted through transitions such as:
- Moving from home to the shops and back
- Transitioning between different shops
- Shifting from shopping to chatting and back to shopping
By delineating these transition zones, we encapsulate John’s routine movements. To an external observer, like a curious neighbor, John’s apparently random journeys, though limited in options, form a stochastic process.
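The states and transitions above can be sketched as a small lookup table. This is a minimal illustration, not a definitive model: the state names, the set of allowed moves, and the helper functions `walk` and `is_valid` are assumptions made for the example, and each next state is chosen uniformly at random because the text gives no probabilities at this point.

```python
import random

# Hypothetical states for John's trip and the moves allowed from each one.
# The exact set of transitions is an assumption made for illustration.
TRANSITIONS = {
    "home": ["traveling"],
    "traveling": ["home", "shopping"],
    "shopping": ["traveling", "shopping", "chatting"],  # "shopping" again = another shop
    "chatting": ["shopping"],
}


def walk(start, steps, rng):
    """Return the states visited, picking each next state uniformly at random."""
    path = [start]
    for _ in range(steps):
        path.append(rng.choice(TRANSITIONS[path[-1]]))
    return path


def is_valid(path):
    """Check that every consecutive pair in the path is an allowed transition."""
    return all(nxt in TRANSITIONS[cur] for cur, nxt in zip(path, path[1:]))


if __name__ == "__main__":
    # What the curious neighbor might observe on one outing.
    print(walk("home", 8, random.Random()))
```

Each run prints a different but always legal sequence of states, which is what "random, though limited in options" amounts to.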
Back to John at home. A Markov chain, as defined by Wikipedia, is a stochastic model describing a sequence of possible events in which the probability of each event depends only on the state attained in the previous event.
Essentially, the future event in a Markov chain relies solely on the current state, reflecting the apparent randomness observed by an external observer like the nosy neighbor. The mathematical model does not aim to decipher intent but rather serves as a predictive platform.
A transition matrix gives this model its structure, with probabilities written as decimals between 0 and 1. In this square (n-by-n) matrix, each row stands for a current state and each column for a possible next state, so the probabilities in every row sum to 1.
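As a rough sketch of such a matrix, the example below uses four hypothetical states for John and invented probabilities. It assumes the common convention that rows are the current state and columns the next state; the names `STATES`, `P`, `rows_sum_to_one`, and `step` are illustrative only.

```python
# Hypothetical states; the order fixes the row and column indices.
STATES = ["home", "traveling", "shopping", "chatting"]

# Row = current state, column = next state. All values are made up.
P = [
    [0.0, 1.0, 0.0, 0.0],  # from home: always start traveling
    [0.3, 0.0, 0.7, 0.0],  # from traveling: go home or start shopping
    [0.0, 0.5, 0.2, 0.3],  # from shopping: leave, visit another shop, or chat
    [0.0, 0.0, 1.0, 0.0],  # from chatting: back to shopping
]


def rows_sum_to_one(matrix, tol=1e-9):
    """Each row is a probability distribution over the next state."""
    return all(abs(sum(row) - 1.0) < tol for row in matrix)


def step(dist, matrix):
    """Advance a probability distribution over states by one transition."""
    n = len(matrix)
    return [sum(dist[i] * matrix[i][j] for i in range(n)) for j in range(n)]


if __name__ == "__main__":
    start = [1.0, 0.0, 0.0, 0.0]  # John is certainly at home
    after_two = step(step(start, P), P)
    for name, prob in zip(STATES, after_two):
        print(f"{name}: {prob:.2f}")
```

Applying `step` repeatedly shows how the neighbor's uncertainty about John's whereabouts spreads across the states over time, using only the current distribution and the matrix.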
Markov chains prove to be invaluable in scenarios where discrete states are involved, yet a comprehensive understanding of the system is lacking.
In the realm of AI, Markov chains form the basis for predictive text generation. As the model accumulates more textual data, updated statistics are integrated into the Markov chain, refining the predictive capabilities.
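A minimal sketch of this idea, assuming a word-level model in which the current word is the state: the class `NextWordModel` and its methods are hypothetical names chosen for the example, not a reference to any particular library or product. Each call to `update` folds new text into the running counts, which is one simple way the statistics could be refreshed as more data arrives.

```python
from collections import Counter, defaultdict


class NextWordModel:
    """Counts which word follows which; the current word is the state."""

    def __init__(self):
        self.counts = defaultdict(Counter)

    def update(self, text):
        """Fold another piece of text into the running statistics."""
        words = text.lower().split()
        for current, following in zip(words, words[1:]):
            self.counts[current][following] += 1

    def probability(self, word, following):
        """Estimated probability that `following` comes right after `word`."""
        followers = self.counts.get(word.lower())
        if not followers:
            return 0.0
        return followers[following.lower()] / sum(followers.values())

    def predict(self, word):
        """Most frequently seen next word, or None if the word is unknown."""
        followers = self.counts.get(word.lower())
        if not followers:
            return None
        return followers.most_common(1)[0][0]


if __name__ == "__main__":
    model = NextWordModel()
    model.update("the cat sat on the mat")
    model.update("the cat ate")
    print(model.predict("the"), model.probability("the", "cat"))
```

The counts play the role of an unnormalized transition matrix: dividing each row by its total gives the transition probabilities.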
Applied to predictive text for English, a Markov model estimates the likelihood of each letter from the letters that precede it. With k-grams (runs of k consecutive letters), the previous k letters act as the current state, and the model predicts the next letter from how often each letter has followed that k-gram in the text seen so far.
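The letter-level version can be sketched in a few lines. This assumes a k-gram means k consecutive characters and that probabilities are estimated from raw counts; the function names `build_kgram_model` and `next_letter_probs` and the tiny sample text are hypothetical.

```python
from collections import Counter, defaultdict


def build_kgram_model(text, k):
    """Map each run of k characters to counts of the character that follows it."""
    model = defaultdict(Counter)
    for i in range(len(text) - k):
        model[text[i:i + k]][text[i + k]] += 1
    return model


def next_letter_probs(model, context):
    """Estimated probability of each next character after the given k-gram."""
    counts = model.get(context, Counter())
    total = sum(counts.values())
    if total == 0:
        return {}
    return {ch: n / total for ch, n in counts.items()}


if __name__ == "__main__":
    sample = "the then them there"  # toy corpus, for illustration only
    model = build_kgram_model(sample, k=2)
    print(next_letter_probs(model, "th"))
    print(next_letter_probs(model, "he"))
```

In this toy corpus "th" is always followed by "e", while "he" is followed by four different characters equally often. A larger k gives more specific predictions but needs far more text to see each k-gram often enough.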
In contexts like sentence completion in search engines, the corpus comprises global search terms, enabling the system to adapt to various linguistic nuances, including misspellings.
By grasping these fundamental concepts and delving into the mathematical intricacies, one can unravel the mysterious aspects of AI evolution, paving the way for a more informed journey into future AI advancements.
David Eastman brings a wealth of experience as a professional software developer in London, with engagements at Oracle Corp. and British Telecom. His consultancy work focuses on fostering agile team dynamics, complemented by his authorship of a UI design book and a series of technical articles.