Algebraic Machine Learning

Algebraic Machine Learning (AML) is a new Artificial Intelligence paradigm that combines user-defined symbols with self-generated symbols. The self-generated symbols let AML learn from data and adapt to the world as neural networks do, while retaining the explainability of Symbolic AI.

AML is a purely symbolic approach: it does not use neurons and it is not a neuro-symbolic method. It has no parameters and does not rely on fitting, regression, backtracking, constraint satisfiability, logical rules, production rules or error minimization.

Our Vision

We build more robust and transparent learning systems that minimize the well-known limitations of existing approaches, such as statistical biases, catastrophic forgetting and shallow learning.

Instead of "fixing" any of the existing approaches, we have developed a new AI formalism inspired by Model Theory, a mathematical discipline that combines Abstract Algebra with Mathematical Logic. We felt that a new formalism was needed to ensure transparency, control and safety from first principles. These characteristics allow better cooperation between humans and machines, with whatever constraints and safeguards one may want to place on their interaction. We believe this formalism can be a basis for a more “humane” AI.

The company

We are a startup based in Madrid working in close collaboration with the Champalimaud Research Foundation in Lisbon. We are part of a European consortium of companies and research institutions investigating AML, including DFKI and INRIA, leading centers in artificial intelligence.

Our team has more than two decades of experience building successful machine-learning-based products, such as the DeNovoX peptide de novo sequencing software and the idTracker animal tracking software.


Semantic Embedding in Semilattices 

Martin-Maroto, F., & de Polavieja, G. (2022). Semantic Embedding in Semilattices. arXiv:2205.12618.

Here we give a formal definition of a semantic embedding in a semilattice which can be used to solve machine learning and classic computer science problems. Specifically, a semantic embedding of a problem is here an encoding of the problem as sentences in an algebraic theory that extends the theory of semilattices. We use the recently introduced formalism of finite atomized semilattices to study the properties of the embeddings and their finite models. We give examples of semantic embeddings that can be used to find solutions for the N-Queens completion, the Sudoku, and the Hamiltonian Path problems.
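
For readers unfamiliar with the term, a semilattice is a set with one binary operation that is idempotent, commutative and associative. The sketch below is only a generic illustration of those axioms and of the partial order they induce; it is not the embedding construction from the paper. The constants `x`, `y`, `z` and the helper names `join`, `leq` and `is_semilattice` are assumptions chosen for the example.

```python
from itertools import product

def join(a: frozenset, b: frozenset) -> frozenset:
    """Semilattice operation for this toy example: set union."""
    return a | b

def leq(a: frozenset, b: frozenset) -> bool:
    """Induced partial order: a <= b exactly when join(a, b) == b."""
    return join(a, b) == b

# All subsets of a small, hypothetical set of constants.
constants = ("x", "y", "z")
elements = [
    frozenset(c for c, keep in zip(constants, mask) if keep)
    for mask in product([False, True], repeat=len(constants))
]

def is_semilattice(elems, op) -> bool:
    """Brute-force check of the three semilattice axioms on a finite set."""
    idempotent = all(op(a, a) == a for a in elems)
    commutative = all(op(a, b) == op(b, a) for a in elems for b in elems)
    associative = all(
        op(op(a, b), c) == op(a, op(b, c))
        for a in elems for b in elems for c in elems
    )
    return idempotent and commutative and associative

# Union satisfies the axioms; set difference does not.
union_ok = is_semilattice(elements, join)
difference_ok = is_semilattice(elements, lambda a, b: a - b)
```

An algebraic theory that "extends the theory of semilattices" adds further sentences on top of these three axioms; in this toy setting such a sentence could be an order relation like `leq(frozenset({"x"}), frozenset({"x", "y"}))`.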

Finite Atomized Semilattices 

Martin-Maroto, F., & de Polavieja, G. (2021). Finite Atomized Semilattices. arXiv:2102.08050.

In this work we present a formal mathematical description of finite atomized semilattices, an algebraic construction we used to define and embed models in Algebraic Machine Learning (AML). Concepts that play an important role in AML, such as the full crossing operator and pinning terms, are formalised, among others.

Algebraic Machine Learning 

Martin-Maroto, F., & de Polavieja, G. (2018). Algebraic Machine Learning. arXiv:1803.05252.

This paper lays the foundation of Algebraic Machine Learning (AML) and introduces the main concepts of the methodology. As an alternative to statistical learning, AML offers advantages in combining bottom-up (data) and top-down (pre-existing knowledge) information, and in large-scale parallelization.

In AML, learning and generalization are parameter-free, fully discrete and without function minimization. We introduce this method using a simple problem that is solved step by step. In addition, two more problems, hand-written character recognition (MNIST) and the Queens Completion problem, are explored as examples of supervised and unsupervised learning, respectively.

Software and Patents

Method for large-scale distributed machine learning using formal knowledge and training data

The method consists of independently calculating discrete algebraic models of the input data in one or many computing devices, and of asynchronously sharing components of the algebraic models among the computing devices, without constraints on when or how many times the sharing needs to happen. Each computing device improves its algebraic model every time it receives new input data or shared components from other computing devices, thereby providing a solution to the scaling-up problem of machine learning systems.
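
To give an intuition for why sharing can be unconstrained in timing and repetition, here is a toy sketch. It is not the patented method: it assumes, purely for illustration, that a device's model is a set of opaque components and that incorporating shared components is a set union. The `Worker` class, its methods and the component names are all hypothetical.

```python
class Worker:
    """Toy computing device whose 'model' is a set of opaque components."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.model = frozenset()  # stand-in for a discrete algebraic model

    def learn(self, components) -> None:
        """Update the local model from new input data (here: add components)."""
        self.model = self.model | frozenset(components)

    def receive(self, shared) -> None:
        """Incorporate components shared by another device."""
        self.model = self.model | frozenset(shared)


def replay(messages) -> frozenset:
    """Apply a sequence of shared messages to a fresh worker."""
    worker = Worker("replay")
    for message in messages:
        worker.receive(message)
    return worker.model


# Three hypothetical devices learn independently from different data.
a, b, c = Worker("a"), Worker("b"), Worker("c")
a.learn({"c1", "c2"})
b.learn({"c2", "c3"})
c.learn({"c4"})

# The same messages delivered in a different order, with repeats,
# lead to the same final model.
in_order = replay([a.model, b.model, c.model])
shuffled_with_repeats = replay([c.model, b.model, a.model, b.model, c.model])
```

Because union is idempotent, commutative and associative, a late, duplicated or reordered message never corrupts the receiver's state in this toy setting. Whether and how the actual AML models have an analogous property is described in the patent and papers, not here.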

Some drawings from the patent