The Dartmouth
May 5, 2024 | Latest Issue

Torresani enhances search engine with image analysis

As sophisticated as search engines have become, the Googles, Baidus and Bings of the world still rely on text-based analysis. But a new project developed by computer science professor Lorenzo Torresani seeks to improve search engines by integrating image-based analysis of web page results.

The project, now two years in development, enhances performance by drawing information from images in websites to find more relevant results.

"I had this idea of trying to use the images inside web pages to improve the accuracy of document search," Torresani said. "This originated from observing the synergy between text and images."

Torresani's work focuses on computer vision, a field in which researchers build systems that recognize objects in photos by exploiting the relationship between text and pictures. The image-based search engine project essentially reverses this process, Torresani said.

"We are saying that pictures also provide information about the text," he said.

After conceiving the idea, Torresani built a working prototype and recruited Sergio Vaamonde, a PhD student from the University of the Basque Country in Spain, as a visiting research fellow at Dartmouth.

The image-based search engine conducts a classical text-based search and looks at the images in the resulting web pages. It then analyzes the pictures based on information contained in the pixels of each image and reorders the search results according to their visual relevance.
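The reordering step described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the `Page` class, the `rerank` function, the weighted-sum combination and the `weight` parameter are all assumptions made for the example, since the article does not say how the text and image signals are combined.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Sequence

Vector = Sequence[float]


@dataclass
class Page:
    """A search result (hypothetical structure, for illustration only)."""
    url: str
    text_score: float  # relevance score from the text-based engine
    image_features: List[Vector] = field(default_factory=list)


def visual_relevance(page: Page, score_image: Callable[[Vector], float]) -> float:
    """Return the best image score on the page, or 0.0 if it has no images."""
    return max((score_image(f) for f in page.image_features), default=0.0)


def rerank(pages: List[Page],
           score_image: Callable[[Vector], float],
           weight: float = 0.5) -> List[Page]:
    """Reorder text-search results by a weighted sum of text and visual scores.

    The weighted sum is an assumption; the article only says results are
    reordered according to their visual relevance.
    """
    def combined(page: Page) -> float:
        visual = visual_relevance(page, score_image)
        return (1.0 - weight) * page.text_score + weight * visual

    return sorted(pages, key=combined, reverse=True)


def toy_score(features: Vector) -> float:
    """Toy image scorer: treats the first feature as the visual relevance."""
    return features[0]


pages = [
    Page("a.example", 0.9),                 # strong text match, no images
    Page("b.example", 0.8, [[1.0, 0.0]]),   # image that matches the query
    Page("c.example", 0.7, [[0.0, 1.0]]),   # image that does not match
]

reordered = [p.url for p in rerank(pages, toy_score)]
```

With these toy numbers, the page whose image matches the query moves ahead of the page that ranked first on text alone. Setting `weight` to 0.0 falls back to the original text-based order.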

"We use some artificial intelligence techniques to get a mathematical representation of the image," Vaamonde said. "In the end, we have a set of numbers that describe the content of the image."
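To make the idea of "a set of numbers that describe the content of the image" concrete, here is a deliberately simple stand-in. The article does not describe the representation the researchers use, so the intensity histogram below, along with the function name and the `bins` parameter, is only an assumed toy descriptor and is far cruder than anything a real system would rely on.

```python
from typing import List, Sequence


def intensity_histogram(pixels: Sequence[Sequence[int]], bins: int = 4) -> List[float]:
    """Summarize a grayscale image (values 0-255) as a normalized histogram.

    Toy descriptor for illustration; not the representation used in the project.
    """
    counts = [0] * bins
    total = 0
    for row in pixels:
        for value in row:
            # Map an intensity in 0..255 to one of `bins` equal-width buckets.
            index = min(value * bins // 256, bins - 1)
            counts[index] += 1
            total += 1
    if total == 0:
        return [0.0] * bins
    return [count / total for count in counts]


# A tiny 2x4 grayscale "image" with two pixels in each intensity range.
tiny_image = [
    [0, 10, 200, 255],
    [64, 70, 128, 130],
]

descriptor = intensity_histogram(tiny_image)
```

Whatever the image size, the result is a fixed-length list of numbers, which is what allows images to be compared with one another mathematically.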

To recognize image content, the search engine uses a "simple but clever trick" that exploits the capabilities of currently available image search engines, such as Google Images, to build a model for the type of image most relevant to the search query, Torresani said.

"We do this in real time, and now we can use that visual model to check whether web pages contain pictures of the search query," he said.
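One way to picture the "visual model" is as a summary of the images an existing image search engine returns for the query. The sketch below assumes a nearest-centroid model with cosine similarity, which the article does not confirm; the function names, the `threshold` value and the example descriptors are all hypothetical, and no real image search API is called.

```python
import math
from typing import List, Sequence

Vector = Sequence[float]


def centroid(vectors: Sequence[Vector]) -> List[float]:
    """Average a non-empty collection of equal-length descriptors."""
    if not vectors:
        raise ValueError("need at least one example descriptor")
    count = len(vectors)
    return [sum(column) / count for column in zip(*vectors)]


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity between two descriptors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def build_visual_model(example_descriptors: Sequence[Vector]) -> List[float]:
    """Build a query model from descriptors of example images.

    Assumption: the examples come from an image search on the same query,
    and the model is simply their centroid.
    """
    return centroid(example_descriptors)


def page_matches_query(model: Vector,
                       page_descriptors: Sequence[Vector],
                       threshold: float = 0.8) -> bool:
    """Return True if any image on the page is close enough to the model."""
    return any(cosine(model, d) >= threshold for d in page_descriptors)


# Hypothetical descriptors of images returned by an image search for the query.
examples = [[1.0, 0.0, 0.2], [0.8, 0.1, 0.0]]
model = build_visual_model(examples)

has_match = page_matches_query(model, [[0.9, 0.0, 0.1]])   # similar image
no_match = page_matches_query(model, [[0.0, 1.0, 0.0]])    # unrelated image
```

Because the model is built only from the query's own image results, nothing has to be trained ahead of time for each possible query, which fits the real-time use Torresani describes.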

The new tool relies on a traditional text-based search engine that performed best on established industry benchmarks.

Adding image-based analysis to the original search engine considerably improves its accuracy. Torresani said the results indicate that the images capture information complementary to the text.

"I think the idea is very fresh, and I think we now have a clear demonstration that it's working and giving very good results," he said.

While the current solution implemented by the project is "very simple," Torresani plans to develop it over the next few years with other PhD students.

"I'll certainly have students working on this in the next few years," he said.

Torresani and Vaamonde will present their project at the annual Special Interest Group on Information Retrieval conference in Dublin this summer. The conference, which spotlights new research and development in the field of information retrieval, is co-sponsored by Internet giants such as Google, Baidu and Microsoft.

"We are going to show our work to the industry and see what they say," Vaamonde said. "I spent so many hours at Dartmouth working on it, and now I want to present it to the rest of the world."

In addition to informing industry professionals about the project, Torresani will use the conference to collect feedback.

Torresani said that he believes the project has potential, and his dream is to use the idea as the foundation for a startup.

"This [project] was accepted into the most prestigious information retrieval conference and I think it will make a big splash in the search engine world," Torresani said. "We're very excited."