
Many technological innovations in data analysis and artificial intelligence (AI) bring with them an exciting vision of the future – and NTT’s Large Language Model (LLM) is one of the most fascinating. As an LLM-based visual machine reading technology, it represents a huge leap toward bridging the gap between AI and human cognition, promising to enable immersive, secure and realistic experiences in virtual environments.

So what exactly is this groundbreaking technology? Essentially, NTT has found a new way to tackle the complexities of sharing – also known as transporting – information with AI, improving efficiency and speed enough to allow for new applications.

This visual machine reading technology offers all kinds of useful applications in the real world, including the enhancement and transmission of visual cognitive abilities. In sports, for instance, athletes would be able to use the technology to analyze and compare their movements with those of top performers, helping to improve their technique and performance. In the medical field, surgeons may enhance their skills by studying and visualizing the precise movements of expert peers, potentially leading to better surgical outcomes and patient care. Regardless of the use case, the technology would be designed and applied in ways that ensure trustworthiness, respect human rights, and protect privacy.

Bridging human expertise with AI power

The goal of this technology is to give AI systems human-like visual understanding, enabling them to interpret and analyze visual elements such as diagrams, graphs and icons with precision and context. This capability not only enhances collaboration between humans and machines but also democratizes access to advanced skills and knowledge previously limited to experts.

By simplifying complex computations and enhancing visual cognitive abilities, the technology opens new frontiers in AI-driven solutions and sets the stage for a future where humans and machines collaborate seamlessly to the benefit of society, people and the planet.

Solving complex challenges with innovative algorithms

Traditionally, optimal transport, the task of finding the least costly way to move one distribution of data onto another, has been dauntingly resource-intensive and time-consuming, limiting its practical application to large-scale datasets.

NTT’s solution leverages the concept of cyclic symmetry, where structures keep their form under specific transformations such as rotation or inversion, as in the regular patterns seen in gears or snowflakes. The approach breaks the optimal transport problem down into smaller, more manageable components by identifying and exploiting symmetries inherent in real-world data. By reducing the complexity of the problem, NTT’s algorithm achieves faster, more efficient computations than traditional methods.
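To make the idea of exploiting symmetry more concrete, here is a minimal Python sketch of why a cyclic symmetry lets an optimal transport problem shrink. It is a toy illustration under stated assumptions, not NTT’s algorithm: the point clouds, the 3-fold rotation, the equal masses and every function name are hypothetical choices made for this example, and the exact solver is a brute-force search that only works for a handful of points. It shows that averaging an optimal plan over the rotations gives a plan that is just as cheap and repeats itself block by block, so a single block of rows is enough to recover the total cost.

```python
import math
from itertools import permutations

M = 3          # order of the cyclic symmetry (3-fold rotation); illustrative choice
BLOCK = 2      # number of points in one symmetric block; illustrative choice
N = M * BLOCK  # total number of points on each side


def rotate(p, k):
    """Rotate the 2-D point p by k * (360 / M) degrees about the origin."""
    a = 2 * math.pi * k / M
    return (p[0] * math.cos(a) - p[1] * math.sin(a),
            p[0] * math.sin(a) + p[1] * math.cos(a))


def symmetric_cloud(base):
    """Index k * BLOCK + r holds base point r rotated k times,
    so one rotation step shifts every index by BLOCK (mod N)."""
    return [rotate(p, k) for k in range(M) for p in base]


def cost_matrix(xs, ys):
    """Squared Euclidean distance between every source and target point."""
    return [[(x[0] - y[0]) ** 2 + (x[1] - y[1]) ** 2 for y in ys] for x in xs]


def plan_cost(C, T):
    """Total transport cost of plan T under cost matrix C."""
    return sum(C[i][j] * T[i][j] for i in range(N) for j in range(N))


def brute_force_plan(C):
    """Exact optimal plan when every point carries mass 1/N: with equal
    masses an optimum is always a one-to-one matching, so try them all."""
    best = min(permutations(range(N)),
               key=lambda p: sum(C[i][p[i]] for i in range(N)))
    return [[1.0 / N if best[i] == j else 0.0 for j in range(N)]
            for i in range(N)]


def symmetrize(T):
    """Average the plan over all M rotations (index shifts by BLOCK)."""
    return [[sum(T[(i + k * BLOCK) % N][(j + k * BLOCK) % N]
                 for k in range(M)) / M
             for j in range(N)] for i in range(N)]


# Hypothetical data: two base points per side, copied around the circle.
sources = symmetric_cloud([(1.0, 0.0), (0.4, 0.3)])
targets = symmetric_cloud([(0.8, 0.5), (0.2, -0.6)])

C = cost_matrix(sources, targets)
T = brute_force_plan(C)   # optimal plan found without using the symmetry
T_sym = symmetrize(T)     # equally cheap plan that repeats block by block

full_cost = plan_cost(C, T)
sym_cost = plan_cost(C, T_sym)
# Only the first BLOCK rows are needed: the other blocks are rotated copies.
block_cost = M * sum(C[i][j] * T_sym[i][j]
                     for i in range(BLOCK) for j in range(N))

print("cost of brute-force optimal plan:", full_cost)
print("cost of symmetrized plan:        ", sym_cost)
print("cost rebuilt from one block:     ", block_cost)
```

The three printed costs agree up to floating-point rounding. In this sketch the saving is only that one block of rows has to be stored and evaluated; a practical method would also need to solve the reduced problem directly rather than by exhaustive search.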

Recognition and future directions

This work was highlighted at the 38th Annual AAAI Conference on Artificial Intelligence, where NTT showcased how the algorithm outperforms traditional methods in both theoretical and experimental contexts.

NTT is committed to refining and expanding the capabilities of its technology. As a cornerstone of the Innovative Optical and Wireless Network (IOWN) initiative, visual machine reading technology has the potential to foster global connectivity and collaboration and enhance human capabilities.

Innovating the future

NTT’s breakthrough marks a pivotal moment in the evolution of AI and data science. It uses the power of large language models and innovative algorithms to open the door to new possibilities for efficient data analysis and transformative AI applications. As NTT continues to refine and expand the technology, the potential for enhancing human-machine collaboration and advancing global connectivity is far-reaching.