Sitemap

A list of all the posts and pages found on the site. For you robots out there, there is an XML version available for digesting as well.

My thesis project explores the role of contextual cues in vision-language models for referring expression generation, introducing the Common Objects Out-of-Context (COOCO) dataset to evaluate models’ ability to leverage context under varying scene conditions, and analyzing attention patterns to understand scene processing in multimodal models.

Portfolio

Publications

Sitemap

Posts by Tags

Talk map

Talks and presentations

Teaching

Terms and Privacy Policy

Blog posts

Jupyter notebook markdown generator

Posts

Future Blog Post

less than 1 minute read

Published: January 01, 2199

This post will show up by default. To disable scheduling of future posts, edit _config.yml and set future: false.

Blog Post number 4

less than 1 minute read

Published: August 14, 2015

This is a sample blog post. Lorem ipsum I can’t remember the rest of lorem ipsum and don’t have an internet connection right now. Testing testing testing this blog post. Blog posts are cool.

Blog Post number 3

less than 1 minute read

Published: August 14, 2014

This is a sample blog post. Lorem ipsum I can’t remember the rest of lorem ipsum and don’t have an internet connection right now. Testing testing testing this blog post. Blog posts are cool.

Blog Post number 2

less than 1 minute read

Published: August 14, 2013

This is a sample blog post. Lorem ipsum I can’t remember the rest of lorem ipsum and don’t have an internet connection right now. Testing testing testing this blog post. Blog posts are cool.

Blog Post number 1

less than 1 minute read

Published: August 14, 2012

This is a sample blog post. Lorem ipsum I can’t remember the rest of lorem ipsum and don’t have an internet connection right now. Testing testing testing this blog post. Blog posts are cool.

Portfolio

Cognitive and Human-Inspired Evaluation of Vision-Language Models in Scene Understanding

Acquiring Complex Concepts with Comparative Learning

This study explores whether Comparative Learning can enable a single multi-task Vision-Language Model (VLM) to acquire complex concepts through the logical composition of primitive notions.

Exploring the Impact of Tokenization Strategies on Machine Translation using Transformer Seq2Seq Architecture

This project examines how different tokenization strategies—character-level, word-level, and subword-level (WordPiece)—affect the translation quality of Transformer-based Neural Machine Translation models, hypothesizing that subword-level tokenization offers the best trade-off between vocabulary efficiency and performance as measured by BLEU scores.

Prototyping with Generative Agents

This project builds on two studies by Joon Sung Park and collaborators to improve the prototyping of social computing systems. The first study, ‘Social Simulacra: Creating Populated Prototypes for Social Computing Systems’ (Park et al., 2022), introduces a technique called social simulacra, which generates realistic simulations of online communities based on a designer’s input (e.g., community goals, rules, and member personas) to expose potential social dynamics—both constructive and disruptive—at scale. The second study, ‘Generative Agents: Interactive Simulacra of Human Behavior’ (Park et al., 2023), presents generative agents, an architecture built around large language models enhanced with memory and reflective reasoning, enabling agents to simulate more coherent and human-like behavior over time. This project proposes integrating generative agents into the social simulacra framework to increase the realism and interpretability of simulated interactions. By giving each agent a distinct memory, personality, and ability to reflect, the system not only better mimics plausible social behavior but also supports qualitative analysis of interactions through agents’ internal perspectives.

Publications

ChatGPT’s Information Seeking Strategy: Insights from the 20-Questions Game

Published in INLG, 2023

Download Paper

COOCO – Common Objects Out-of-Context – Semantic Violation in Scenes: Investigating Multimodal Context in Referential Communication

Published in arXiv, 2025

Download Paper

Filippo Merlo


Pages

Page Not Found

About Me

Archive Layout with Content

Posts by Category

Posts by Collection

CV

CV

Markdown

Page not in menu

Page Archive
