Birdsong and Natural Language Group


We are interested in the neurobiological algorithms associated with vocal learning, especially of birdsong. To research birdsong learning experimentally, we longitudinally observe groups of zebra finches using custom multimodal recording arenas encompassing microphones, video cameras, and wireless animal-borne sensor devices. To process the enormous datasets from months-long recordings of large animal groups, we design deep-network-based approaches in collaboration with the Swiss Data Science Center. We examine the inferred song learning strategies using reinforcement learning theory, a modeling framework that is well matched to the organization of the avian brain. Overall, the premise for our work is that animal behavior provides clues about natural intelligence, i.e., the algorithms for solving a problem.

The analysis of animal vocalizations currently relies, directly or indirectly, on human annotation of vocal segments. This processing step is the cornerstone of vocal communication research: teasing apart the vocal signal from noise is a prerequisite for any scientific insight into the structure and meaning of vocal signals. Human judgment still forms the gold standard for distinguishing the vocal activity of an individual from the many other sounds in animals’ habitats, since it is virtually impossible to solve this important task without invasive measurements.

To promote vocal annotation efforts, we aggregate datasets of animal vocalizations into a massive database of vocal signals. This effort is directed by the NCCR Evolving Language. Our VocallBase (https://vocallbase.evolvinglanguage.ch/) extends across 10,000 species and contains vocalizations that are carefully annotated with the help of domain experts. By adhering to the most stringent annotation standards, we ensure that comparative research is not biased by task variability and cultural differences across research fields.

We plan to integrate VocallBase with our custom web application for human-in-the-loop training of voice activity networks such as WhisperSeg, which we optimize to detect voice activity across thousands of species. These efforts will enhance species monitoring, biodiversity assessment, and ecological research, and will enable proactive strategies to preserve endangered species and habitats.
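To make the notion of voice activity detection concrete, here is a minimal sketch of what such a detector outputs: a list of onset and offset times for vocal segments. It is a simple energy-threshold baseline run on a synthetic signal, not WhisperSeg and not the group's method; the function names, frame length, and threshold are illustrative assumptions.

```python
import math


def frame_energies(signal, frame_len):
    """Mean squared amplitude of consecutive non-overlapping frames."""
    n_frames = len(signal) // frame_len
    return [
        sum(x * x for x in signal[i * frame_len:(i + 1) * frame_len]) / frame_len
        for i in range(n_frames)
    ]


def detect_segments(signal, sample_rate, frame_len=200, threshold=0.01):
    """Return (onset_s, offset_s) pairs where frame energy exceeds threshold.

    Hypothetical baseline: a frame counts as 'vocal' if its energy is above
    a fixed threshold. Learned detectors replace this rule with a network.
    """
    energies = frame_energies(signal, frame_len)
    segments = []
    onset = None  # index of the first frame of the currently open segment
    for i, energy in enumerate(energies):
        active = energy > threshold
        if active and onset is None:
            onset = i
        elif not active and onset is not None:
            segments.append((onset * frame_len / sample_rate,
                             i * frame_len / sample_rate))
            onset = None
    if onset is not None:  # close a segment that runs to the end of the signal
        segments.append((onset * frame_len / sample_rate,
                         len(energies) * frame_len / sample_rate))
    return segments


def synthetic_recording(sample_rate=8000):
    """0.5 s of silence, 0.5 s of a 1 kHz tone, 0.5 s of silence (toy data)."""
    n = sample_rate // 2
    silence = [0.0] * n
    tone = [0.5 * math.sin(2 * math.pi * 1000 * t / sample_rate) for t in range(n)]
    return silence + tone + silence


segments = detect_segments(synthetic_recording(), sample_rate=8000)
print(segments)  # [(0.5, 1.0)]
```

A fixed threshold like this fails as soon as background noise is louder than the vocalization, which is one reason human annotation remains the reference and learned detectors are trained against it.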

We also explore the relevance of our biological insights for natural language processing (NLP). We aim to identify possible parallels between human language and birdsong and to test the relevance of evolutionary vocal learning mechanisms for the processing of human language. We direct our NLP outreach efforts toward helping researchers assimilate the scientific literature and generate scientific arguments. We deploy our NLP tools in a user-friendly web application for assisted scientific writing; see https://endoc.ethz.ch.

We use our mixed educational backgrounds to advance knowledge at the boundary between biology and engineering, encompassing natural behavior, the organization of the songbird brain, computational theories of vocal learning, the evolution of language, and new methods for language processing.

Our research

Additional Information

Nianlong Gu, Kanghwi Lee, Maris Basha, Sumit Kumar Ram, Guanghao You, and Richard H. R. Hahnloser. Positive Transfer of the Whisper Speech Transformer to Human and Animal Voice Activity Detection. IEEE ICASSP 2024.

The award for the challenge on Document-based Visual Question Answering (PDF-DQA) was presented at the first Document Intelligence and Understanding (DocIU) Workshop at CIKM'23 (the 32nd International Conference on Information and Knowledge Management, Birmingham, UK). Congratulations!

L. Rüttimann, J. Rychen, T. Tomka, H. Hörster, M. D. Rocha, and R.H.R. Hahnloser. 2022. Multimodal system for recording individual-level behaviors in songbird groups. bioRxiv.

Yingqiang Gao, Nianlong Gu, Jessica Lam, and Richard H.R. Hahnloser. 2022. Do Discourse Indicators Reflect the Main Arguments in Scientific Papers? In Proceedings of the 9th Workshop on Argument Mining, pages 34–50, International Conference on Computational Linguistics.

Nianlong Gu, Elliott Ash, and Richard H.R. Hahnloser. 2022. MemSum: Extractive Summarization of Long Documents using Multi-step Episodic Markov Decision Processes. In Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (ACL 2022), Long Papers.

https://aclanthology.org/2022.acl-long.450/

May 2021: A system for controlling vocal communication networks.


J. Rychen, D.I. Rodrigues, T. Tomka, L. Rüttimann, H. Yamahachi and R.H.R. Hahnloser (2021). A system for controlling vocal communication networks. Scientific Reports, 11(1).

https://www.nature.com/articles/s41598-021-90549-0

2020: New funding - NCCR on Evolving Language


The Swiss National Centre of Competence in Research (NCCR) Evolving Language is a nationwide interdisciplinary research consortium bringing together research groups from the humanities, from language and computer science, the social sciences, and the natural sciences at an unprecedented level.


Paper accepted at IEEE ICASSP 2024:

Nov 2023: Yingqiang Gao was awarded the DocIU Cup Award (1st place)

Multimodal system for recording individual-level behaviors in songbird groups

Oct 2022: New paper

May 2022: New paper
