Artificial intelligence is increasingly shaping our daily lives and is often involved in important decisions, from granting a loan to hiring a new employee to identifying the best route for an airplane. But can we really trust the algorithms? In this interview, Andrea Passerini, Full Professor at the Department of Information Engineering and Computer Science (DISI) of the University of Trento, discusses the challenges and opportunities of creating AI systems we can genuinely trust. From the technical limitations of deep neural networks to the social implications of automated decision-making, Passerini offers valuable insights on how we might build AI that empowers rather than replaces human capabilities.
Hi Andrea, thanks for contributing to Grow Digital Insights and sharing your expertise on this topic with us. Let’s start with the basics. What do we mean when we talk about Trustworthy AI?
When we talk about Trustworthy AI, we're referring to artificial intelligence based on predictive models - in particular, machine learning models - which, by their nature, don't offer absolute guarantees of correctness. These are systems that can make predictions that are more or less accurate, depending on how well they can correctly interpret the features of the input to be processed.
The most recent and widespread models, such as Deep Neural Networks, are responsible for the great enthusiasm around AI over the last fifteen years. They're known for their high accuracy on standardized benchmarks, but at the same time, they are difficult to interpret: it's not easy to understand why they make a certain prediction. This creates problems both in terms of transparency and reliability, especially because these technologies are increasingly integrated into our everyday devices and software.
This is where the concepts of "trustworthiness" and "explainability" come from. With deep models, which involve millions of parameters and complex internal processes, it's not realistic to expect a fully transparent explanation. However, it is possible to identify the components of the input that most influenced a decision, for example using so-called saliency or heat maps, which show which parts of an image contributed to a certain classification.
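To make this concrete, the sketch below shows one common way to compute such a saliency map: backpropagating the score of the predicted class down to the input pixels. It is a minimal illustration that assumes PyTorch and torchvision are available; the image path is a placeholder, and production toolkits such as Captum implement more robust attribution methods.

```python
# Minimal gradient-based saliency sketch (illustrative only).
# Assumes PyTorch and torchvision are installed; "photo.jpg" is a placeholder path.
import torch
from torchvision import models, transforms
from PIL import Image

model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
model.eval()

preprocess = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])

image = Image.open("photo.jpg").convert("RGB")
x = preprocess(image).unsqueeze(0)
x.requires_grad_(True)

# Forward pass, then backpropagate the score of the predicted class.
scores = model(x)
class_idx = scores.argmax(dim=1).item()
scores[0, class_idx].backward()

# The saliency map is the maximum absolute gradient over the colour channels:
# bright pixels are the ones that most influenced the prediction.
saliency = x.grad.abs().max(dim=1)[0].squeeze()
print(saliency.shape)  # a 224 x 224 heat map, ready to overlay on the image
```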
Can you give a concrete example of this problem?
Of course. A well-known example is that of a system that needs to recognize wolves in images. The model seems to work well, but in reality, it has learned to identify snow as the main indicator of the presence of a wolf, because in the dataset the wolf always appears in snowy environments, while the dog does not. So, if you show the model a dog in a snowy landscape, it risks incorrectly classifying it as a wolf. This is a case of "right for the wrong reason": the prediction is correct in the context of the training set, but based on a wrong correlation.
Is it possible, therefore, to design truly reliable, "trustworthy" systems?
It's still an open challenge. There are various strategies and visions. One thesis - widespread especially with Large Language Models - argues that if the model has seen a sufficient amount of data, it will rarely encounter "out of distribution" cases, that is, situations too different from those seen during training. But this hypothesis is problematic: the fact that the model has seen a lot doesn't guarantee that it will behave correctly in every context.
Certainly, more data can improve accuracy, but it doesn't guarantee reliability. This is why complementary approaches are being studied, such as formal verification, which serves to ensure that, under certain conditions, the output of the network falls within a specific range. However, these techniques are currently applicable only to small models because they have a high computational cost. Another strategy is to integrate predictive models with symbolic components, based on explicit knowledge and logic, that allow more formal and interpretable control. For example, you can use a predictive system to recognize road signs and pedestrians, but then apply logical rules to decide how to behave, respecting the rules of the highway code.
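As a rough illustration of this division of labour, the toy sketch below keeps the learned perception step and the explicit rules separate. The detections, confidence threshold, and rules are invented purely for the example and do not come from any real driving system.

```python
# Toy sketch of a symbolic rule layer on top of a perception model (illustrative only).
# The detections and rules below are invented for the example, not a real system.
from dataclasses import dataclass

@dataclass
class Detection:
    label: str         # e.g. "stop_sign", "pedestrian", "speed_limit_30"
    confidence: float  # probability assigned by the neural detector

def perceive(frame) -> list[Detection]:
    """Stand-in for a neural detector: in reality this would run a network on the frame."""
    return [Detection("stop_sign", 0.97), Detection("pedestrian", 0.88)]

def decide(detections: list[Detection], speed_kmh: float) -> str:
    """Explicit, inspectable traffic rules applied to the detector's output."""
    confident = {d.label for d in detections if d.confidence >= 0.8}
    if "pedestrian" in confident:
        return "brake"        # rule: always yield to pedestrians
    if "stop_sign" in confident:
        return "stop"         # rule: stop at stop signs
    if "speed_limit_30" in confident and speed_kmh > 30:
        return "slow_down"    # rule: respect the posted limit
    return "proceed"

print(decide(perceive(frame=None), speed_kmh=50.0))  # -> "brake"
```

The point of the separation is that the second half of the pipeline can be read, audited, and changed without retraining anything, while the neural part handles the perception it is good at.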
Can you give some more details on the formal verification approach? How does it work?
Formal verification analyzes how the neural network processes inputs, evaluating whether the output remains within a certain range as the input data vary. However, as we were saying, this is only feasible on networks with few nodes; it's too onerous for models with millions of parameters. But in "safety-critical" contexts, such as aviation, smaller predictive models are often used, precisely because they are verifiable.
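One simple intuition for how such an analysis can work is interval bound propagation: push a box of possible inputs through the network layer by layer and check that the resulting output bounds satisfy the property of interest. The tiny network, weights, and safety threshold below are made up purely for illustration; real verifiers for safety-critical systems use far more sophisticated and precise techniques.

```python
# Interval bound propagation on a tiny two-layer ReLU network (illustrative only).
# The weights, input box, and threshold are made up for the example.
import numpy as np

def interval_affine(lo, hi, W, b):
    """Propagate an input box [lo, hi] through x -> W @ x + b."""
    W_pos, W_neg = np.clip(W, 0, None), np.clip(W, None, 0)
    new_lo = W_pos @ lo + W_neg @ hi + b
    new_hi = W_pos @ hi + W_neg @ lo + b
    return new_lo, new_hi

def interval_relu(lo, hi):
    return np.maximum(lo, 0.0), np.maximum(hi, 0.0)

# A toy network: 2 inputs -> 3 hidden ReLU units -> 1 output.
W1 = np.array([[1.0, -0.5], [0.3, 0.8], [-0.2, 0.4]]); b1 = np.array([0.1, 0.0, -0.1])
W2 = np.array([[0.6, -0.4, 0.9]]);                      b2 = np.array([0.05])

# Question: for every input whose coordinates lie in [0.4, 0.6],
# does the output stay below a safety threshold of 1.0?
lo, hi = np.array([0.4, 0.4]), np.array([0.6, 0.6])
lo, hi = interval_relu(*interval_affine(lo, hi, W1, b1))
lo, hi = interval_affine(lo, hi, W2, b2)

print(f"output bounds: [{lo[0]:.3f}, {hi[0]:.3f}]")
print("property verified" if hi[0] <= 1.0 else "cannot verify property")
```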
Does the concept of trustworthy AI concern only technical correctness, or also social aspects such as bias?
Both. In critical contexts, you can't afford errors: you must have verifiable models with minimized risk of error. But if we're talking about LLMs that interact with people, it's also important to prevent bias, avoid hallucinations, and provide clear explanations that help the user evaluate trust in the response.
A serious problem is that these models know how to be very convincing. They can provide seemingly solid explanations even when the output is wrong, leading the user to believe them. This is why it is said, only half-jokingly, that to know whether you can trust the answer, you must already know it. Fortunately, in areas such as medical diagnosis or code writing, an expert can verify the output faster than starting from scratch, so the added value is real.
Are there objective systems to evaluate whether one model is more reliable than another?
There are updated benchmarks that test performance on various tasks. But these indicators are not enough. Trust also depends on how capable the user is of understanding when to trust the system. There is a phenomenon called automation bias, that is, the tendency to blindly trust the machine, especially if you are not an expert. This can lead users to suspend their critical judgment.
Is it a risk for critical thinking, then?
Yes, there is a real fear that critical thinking could atrophy if we delegate too much to machines. This is why, in the European project TANGO, which I'm coordinating, we address the issue by trying to build systems that empower human capabilities instead of replacing them. AI should be a support, an ally, not a substitute.
How does TANGO plan to achieve this?
In our approach, which is also shared by the European strategy on AI, the principle that "it takes two to tango" applies: collaboration between person and machine is needed. The goal is to make better decisions together, leveraging each other's strengths and compensating for respective weaknesses.
To achieve this, TANGO develops AI systems that co-evolve with human beings to support complex decisions. We work on interactive learning technologies, explainability, neurosymbolic systems, and collaborative decision making. We have case studies in the medical field (the perinatal care pathway, surgical decisions), banking (algorithmic recourse for mortgages), and policymaking (reducing gender inequalities in academia). The goal is to create tools that really improve decisions, together with people.
About Andrea Passerini
Andrea Passerini is Full Professor at the Department of Information Engineering and Computer Science (DISI) of the University of Trento, where he coordinates the research program on Deep and Structured Machine Learning and heads the Structured Machine Learning (SML) group. His main research interests lie at the intersection of learning and reasoning and include structured machine learning, neuro-symbolic integration, explainable and interactive machine learning, preference elicitation, and learning with constraints.