10 out of 200: Escaping the dependency hell – Fernando Chirigati improves computational reproducibility
Meet Fernando Chirigati (on the right-hand side in the above photo, together with the ReproZip team), computer scientist and one of this year’s 10 out of 200 young researchers participating in the 7th Heidelberg Laureate Forum from September 22 to 27, 2019.
What are your name and nationality?
My name is Fernando Chirigati, and I’m Brazilian.
Where did you study and where are you currently based?
I received a B.E. in Computer and Information Engineering from the Federal University of Rio de Janeiro (UFRJ) in Brazil. Later, I came to New York City to pursue a Ph.D. in Computer Science at NYU Tandon School of Engineering.
What is your current position?
I’m a Postdoctoral Research Associate at NYU Tandon School of Engineering and NYU Center for Data Science.
What is the focus of your research? What is your research project?
My research is mainly focused on data science and the different challenges that arise when handling data-intensive experiments. As an example, one of my research projects concerns reproducibility: how we can automate and simplify the work of reproducing results so that it becomes a standard practice in computational and data science. In another research project, the goal is to develop a search engine that helps data scientists and subject matter experts discover, explore, and integrate datasets available across the Web.
Why did you become a computer scientist?
I was 11 when my family first bought our own home computer. While I was mainly using it to play games and to discover the wonders of the Web on Sunday afternoons (when no one was using the shared family line), it didn’t take long for me to turn my curiosity inwards, that is, into how the components of a computer actually work. It was during my undergraduate program that I found my passion for research, when I applied for a young research scholarship and began working with data provenance and data-intensive applications.
What are some of the fundamental challenges you have faced in your academic career?
The biggest one is certainly lack of time: we often want to work on many projects and interesting research ideas, but there is only so much time. Knowing how to prioritize projects is a prime skill that I have yet to master.
What do you feel are the greatest pressures facing scientists today?
In my opinion, the pressure of publishing is the greatest one, and the most harmful. Scientists are expected to write and publish an increasing number of papers, which I believe negatively impacts science: instead of publishing papers for the good of science, our main goal becomes to bring attention to sponsoring institutions and to succeed in a very competitive environment. Furthermore, this culture encourages scientists to cut corners: under the pressure of paper deadlines, research outcomes are often not thoroughly verified, and they are then hard to reproduce.
What are you doing besides research?
I recently started taking Surfset classes, which are surf-inspired workouts that help me stay active and more relaxed. During my free time, I love baking cakes and hanging out with my family (dog included!) and friends.
How did you hear about the HLF and why did you apply?
My supervisor shared the call for applications and suggested that I apply. I thought it would be a great opportunity to meet not only today’s greatest minds in Computer Science and Mathematics, but also the exceptional young researchers from all around the globe, i.e., the potential future giants in our fields.
What do you expect from this meeting?
I would love to share my work with people from different backgrounds and areas and to gather insightful feedback from varying perspectives on reproducible science. I’m also looking forward to interacting with the laureates and hearing about their amazing journeys and experiences. Finally, I want to learn about all the interesting projects that the selected young researchers are working on. It’s going to be an exciting week!
Which laureates present at the forum would you really like to talk to and what do you want to ask them?
I would love to exchange ideas with Whitfield Diffie and Martin Hellman, who pioneered public-key cryptography, and Vinton Gray Cerf, one of the fathers of the Internet. I’m very interested in hearing about what inspired them to develop such important technologies, and how they think science has changed over the years.
Who were your most important mentors and what lessons did they pass on to you?
Professor Juliana Freire, my Ph.D. advisor and current supervisor, and Professor Marta Mattoso, who mentored me during my young research scholarship in Brazil. They both have a great passion for teaching and research and always encouraged me to do my best in my field.
You developed ReproZip, a tool for data management. What is this tool about and why do you think it fosters reproducibility and transparency of science?
Reproducibility of computational processes is hard due to the many software dependencies one has to deal with, a problem colloquially known as “dependency hell.” Researchers struggle to understand all the chains of dependencies their experiments have, and it is equally troublesome to install everything and then reproduce and reuse the corresponding results on a different machine. ReproZip was developed to close this gap: it can automatically track all the dependencies of a computational experiment, create a self-contained and lightweight bundle, and re-create the same experiment on a different machine, all with just a few commands or clicks. In other words, ReproZip makes it easy to create and share reproducible results, fostering transparency of science. For whoever is interested in the tool (and I hope you all are!), I highly recommend taking a look at our website.
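For readers who want a feel for the "few commands" mentioned above, here is a minimal sketch in Python that only assembles the command lines for a typical trace, pack, and unpack cycle; it does not run them. It assumes ReproZip's command-line tools (reprozip and reprounzip) and the "directory" unpacker as described in the project's documentation. The script name analysis.py, the bundle name experiment.rpz, the folder experiment_dir, and the helper function reprozip_workflow are hypothetical and chosen only for illustration.

```python
import shlex
from typing import List


def reprozip_workflow(
    experiment_cmd: List[str],
    bundle: str = "experiment.rpz",      # hypothetical bundle file name
    target_dir: str = "experiment_dir",  # hypothetical unpack location
) -> List[List[str]]:
    """Build (but do not execute) the commands for one packing/unpacking cycle."""
    return [
        # 1. Run the experiment under tracing to record what it depends on.
        ["reprozip", "trace", *experiment_cmd],
        # 2. Pack the traced experiment into a single bundle file.
        ["reprozip", "pack", bundle],
        # 3. On another machine: unpack the bundle into a plain directory.
        ["reprounzip", "directory", "setup", bundle, target_dir],
        # 4. Re-run the experiment from the unpacked directory.
        ["reprounzip", "directory", "run", target_dir],
    ]


if __name__ == "__main__":
    # Print the commands so they can be reviewed before running them by hand.
    for step in reprozip_workflow(["python", "analysis.py"]):
        print(shlex.join(step))
```

The first two commands would be run on the machine where the experiment originally works, and the last two on the machine where it should be reproduced. The "directory" unpacker is only one option and expects a compatible system; the project's documentation describes other unpackers for cases where the target machine differs more substantially.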